TGA ACA AGA GAG TGC TCA AGA AGC TGT CCA AGG ACG GCT CCA GAG AGG 48
 *  Thr Arg Glu Cys Ser Arg Ser Cys Pro Arg Thr Ala Pro Gln Arg

GGG TCG ACT TTA GCC AGG CGG AAA AGG AGC GCC GGG GCT GGC AGC CAC 144
Gly Ser Thr Leu Ala Arg Arg Lys Arg Ser Ala Gly Ala Gly Ser His

TGT CAA AAG ACC TCC CTG CGG GTA AAC TTC GAG CAC ATC GGC TCG GAC 192
Cys Gln Lys Thr Ser Leu Arg Val Asn Phe Glu Asp Ile Gly Trp Asp

AGC TCG ATC ATT GCA CCC AAG GAG TAT GAA GCC TAC GAG TGT AAG GGC 240
Ser Trp Ile Ile Ala Pro Lys Glu Tyr Glu Ala Tyr Glu Cys Lys Gly

GGC TGC TTC TTC CCC TTC GCT GAC CAT GTG ACG CCC ACG AAA CAC GCT 288

ATC GTG CAG ACC CTG GTG CAT CTC AAG TTC CCC ACA AAG GTG GGC AAG 336
Ile Val Gln Thr Leu Val His Leu Lys Phe Pro Thr Lys Val Gly Lys

GTG GCA GAG TGT GGG TGC AGG TAGTATCTGC CTGCGGG 470

(57) Abstract



Purified BMP-9 proteins and processes for producing them are disclosed. The proteins may be used in the treatment of 
bone and cartilage defects and in wound healing and related tissue repair.

BMP-9 COMPOSITIONS

The present invention relates to a novel family of purified 
proteins designated BMP-9 proteins and processes for obtaining 
them. These proteins may be used to induce bone and/or 
cartilage formation and in wound healing and tissue repair. 

The murine BMP-9 DNA sequence (SEQ ID NO: 1) and amino
acid sequence (SEQ ID NO: 2) are set forth in Figure 1. The human
BMP-9 sequence is set forth in Figure 3 (SEQ ID NO: 8 and SEQ
ID NO: 9). It is contemplated that BMP-9 proteins are capable
of inducing the formation of cartilage and/or bone. BMP-9
proteins may be further characterized by the ability to
demonstrate cartilage and/or bone formation activity in the rat
bone formation assay described below.

Murine BMP-9 is characterized by comprising amino acid
#319 to #428 of Figure 1 (SEQ ID NO: 2, amino acids #1-110).
Murine BMP-9 may be produced by culturing a cell transformed
with a DNA sequence comprising nucleotide #610 to nucleotide
#1893 as shown in Figure 1 (SEQ ID NO: 1) and recovering and
purifying from the culture medium a protein characterized by
the amino acid sequence comprising amino acid #319 to #428 as
shown in Figure 1 (SEQ ID NO: 2) substantially free from other
proteinaceous materials with which it is co-produced.
Human BMP-9 is expected to be homologous to murine BMP-9
and is characterized by comprising amino acid #1 (Ser, Ala,
Gly) to #110 (Arg) of Figure 3 (SEQ ID NO: 9). The invention
includes methods for obtaining the DNA sequences encoding human
BMP-9. These methods entail utilizing the murine BMP-9
nucleotide sequence or portions thereof to design probes to
screen libraries for the human gene or fragments thereof using
standard techniques. Human BMP-9 may be produced by culturing
a cell transformed with the BMP-9 DNA sequence and recovering
and purifying BMP-9 from the culture medium. The expressed
protein is isolated, recovered, and purified from the culture
medium. The purified expressed protein is substantially free
from other proteinaceous materials with which it is
co-produced, as well as from other contaminants. The recovered
purified protein is contemplated to exhibit cartilage and/or
bone formation activity. The proteins of the invention may be
further characterized by the ability to demonstrate cartilage
and/or bone formation activity in the rat bone formation assay
described below.

Human BMP-9 may be produced by culturing a cell
transformed with a DNA sequence comprising nucleotide #124 to
#453 as shown in SEQ ID NO: 8 and recovering and purifying from
the culture medium a protein characterized by the amino acid
sequence of SEQ ID NO: 9 from amino acid #1 to amino acid #110
substantially free from other proteinaceous materials with
which it is co-produced.

Another aspect of the invention provides pharmaceutical
compositions containing a therapeutically effective amount of
a BMP-9 protein in a pharmaceutically acceptable vehicle or
carrier. BMP-9 compositions of the invention may be used in
the formation of cartilage. These compositions may further be
utilized for the formation of bone. BMP-9 compositions may
also be used for wound healing and tissue repair. Compositions
of the invention may further include at least one other
therapeutically useful agent such as the BMP proteins BMP-1,
BMP-2, BMP-3, BMP-4, BMP-5, BMP-6, and BMP-7 disclosed for
instance in PCT publications WO88/00205, WO89/10409, and
WO90/11366, and BMP-8, disclosed in U.S. application Ser. No.
07/641,204 filed January 15, 1991, Ser. No. 07/525,357 filed
May 16, 1990, and Ser. No. 07/800,364 filed November 20, 1991.

WO 93/00432                                     PCT/US92/05374

The compositions of the invention may comprise, in
addition to a BMP-9 protein, other therapeutically useful
agents including growth factors such as epidermal growth factor
(EGF), fibroblast growth factor (FGF), transforming growth
factor (TGF-α and TGF-β), and insulin-like growth factor (IGF).
The compositions may also include an appropriate matrix, for
instance, for supporting the composition and providing a
surface for bone and/or cartilage growth. The matrix may
provide slow release of the osteoinductive protein and/or the
appropriate environment for presentation thereof.

The BMP-9 compositions may be employed in methods for
treating a number of bone and/or cartilage defects, periodontal
disease and various types of wounds. These methods, according
to the invention, entail administering to a patient needing
such bone and/or cartilage formation, wound healing or tissue
repair, an effective amount of a BMP-9 protein. These methods
may also entail the administration of a protein of the
invention in conjunction with at least one of the novel BMP
proteins disclosed in the co-owned applications described
above. In addition, these methods may also include the
administration of a BMP-9 protein with other growth factors
including EGF, FGF, TGF-α, TGF-β, and IGF.

Still a further aspect of the invention are DNA sequences
coding for expression of a BMP-9 protein. Such sequences
include the sequence of nucleotides in a 5' to 3' direction
illustrated in Figure 1 (SEQ ID NO: 1) and Figure 3 (SEQ ID NO:
8) or DNA sequences which hybridize under stringent conditions
with the DNA sequences of Figure 1 or 3 and encode a protein
having the ability to induce the formation of cartilage and/or
bone. Finally, allelic or other variations of the sequences of
Figure 1 or 3, whether such nucleotide changes result in
changes in the peptide sequence or not, are also included in
the present invention.

A further aspect of the invention includes vectors
comprising a DNA sequence as described above in operative
association with an expression control sequence therefor.
These vectors may be employed in a novel process for producing
a BMP-9 protein of the invention in which a cell line
transformed with a DNA sequence encoding a BMP-9 protein in
operative association with an expression control sequence
therefor, is cultured in a suitable culture medium and a BMP-9
protein is recovered and purified therefrom. This process may
employ a number of known cells, both prokaryotic and eukaryotic,
as host cells for expression of the polypeptide.

Other aspects and advantages of the present invention will 
be apparent upon consideration of the following detailed 
description and preferred embodiments thereof. 

Brief Description of the Drawing

FIG. 1 comprises DNA sequence and derived amino acid sequence
of murine BMP-9 from clone ML14a further described below.

FIG. 2 comprises DNA sequence and derived amino acid sequence
of human BMP-4 from lambda U2OS-3 ATCC #40342.

FIG. 3 comprises DNA sequence and derived amino acid sequence
of human BMP-9 from λ FIX/HG111 ATCC #75252.

Detailed Description of the Invention

The murine BMP-9 nucleotide sequence (SEQ ID NO: 1) and
encoded amino acid sequence (SEQ ID NO: 2) are depicted in
Figure 1. Purified murine BMP-9 proteins of the present
invention are produced by culturing a host cell transformed with
a DNA sequence comprising the DNA coding sequence of Figure 1
(SEQ ID NO: 1) from nucleotide #610 to nucleotide #1893 and
recovering and purifying from the culture medium a protein
which contains the amino acid sequence or a substantially
homologous sequence as represented by amino acid #319 to #428
of Figure 1 (SEQ ID NO: 2). The BMP-9 proteins recovered from
the culture medium are purified by isolating them from other
proteinaceous materials with which they are co-produced and
from other contaminants present.

The human BMP-9 nucleotide and amino acid sequences are depicted
in SEQ ID NO: 8 and 9. Mature human BMP-9 is expected to
comprise amino acid #1 (Ser, Ala, Gly) to #110 (Arg).

Human BMP-9 may be produced by culturing a cell
transformed with a DNA sequence comprising nucleotide #124 to
#453 as shown in SEQ ID NO: 8 and recovering and purifying from
the culture medium a protein characterized by the amino acid
sequence of SEQ ID NO: 9 from amino acid #1 to amino acid #110
substantially free from other proteinaceous materials with
which it is co-produced.

BMP-9 proteins may be characterized by the ability to
induce the formation of cartilage. BMP-9 proteins may be
further characterized by the ability to induce the formation of
bone. BMP-9 proteins may be further characterized by the
ability to demonstrate cartilage and/or bone formation activity
in the rat bone formation assay described below.

The BMP-9 proteins provided herein also include factors
encoded by sequences similar to those of Figure 1 and Figure 3
(SEQ ID NO's: 1 and 8), but into which modifications are
naturally provided (e.g. allelic variations in the nucleotide
sequence which may result in amino acid changes in the
polypeptide) or deliberately engineered. For example,
synthetic polypeptides may wholly or partially duplicate
continuous sequences of the amino acid residues of Figure 1 or
Figure 3 (SEQ ID NO's: 2 and 9). These sequences, by virtue of
sharing primary, secondary, or tertiary structural and
conformational characteristics with bone growth factor
polypeptides of Figure 1 and Figure 3, may possess bone growth
factor biological properties in common therewith. Thus, they
may be employed as biologically active substitutes for
naturally-occurring BMP-9 and other BMP-9 polypeptides in
therapeutic processes.

Other specific mutations of the sequences of BMP-9
proteins described herein involve modifications of
glycosylation sites. These modifications may involve O-linked
or N-linked glycosylation sites. For instance, the absence of
glycosylation or only partial glycosylation results from amino
acid substitution or deletion at asparagine-linked
glycosylation recognition sites. The asparagine-linked
glycosylation recognition sites comprise tripeptide sequences
which are specifically recognized by appropriate cellular
glycosylation enzymes. These tripeptide sequences are either
asparagine-X-threonine or asparagine-X-serine, where X is
usually any amino acid. A variety of amino acid substitutions
or deletions at one or both of the first or third amino acid
positions of a glycosylation recognition site (and/or amino
acid deletion at the second position) results in
non-glycosylation at the modified tripeptide sequence.
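
By way of illustration only, the tripeptide recognition rule described above may be sketched in Python. The function names and the toy peptide are hypothetical and are not taken from the BMP-9 sequences. Excluding proline at position X is an added assumption reflecting common practice; the text above states only that X is usually any amino acid.

```python
import re

# Asn-X-Ser/Thr tripeptide as described in the text. Excluding proline at X
# is an assumption added for this sketch, not a statement from the source.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return the 1-based position of the Asn in each N-X-S/T tripeptide."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def abolish_sequon(protein: str, asn_position: int, replacement: str = "Q") -> str:
    """Substitute the Asn at a 1-based position, which removes that site."""
    i = asn_position - 1
    if protein[i].upper() != "N":
        raise ValueError(f"residue {asn_position} is not Asn")
    return protein[:i] + replacement + protein[i + 1:]


toy = "MKNGSAANPTLLNVTQ"  # hypothetical peptide, not a BMP-9 sequence
sites = find_sequons(toy)
mutant = abolish_sequon(toy, sites[0])
print(sites, find_sequons(mutant))
```

Substituting the first residue of a site is only one of the modifications contemplated above; a change at the third position, or a deletion, would be handled analogously.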

The present invention also encompasses the novel DNA
sequences, free of association with DNA sequences encoding
other proteinaceous materials, and coding on expression for
BMP-9 proteins. These DNA sequences include those depicted in
Figure 1 or Figure 3 (SEQ ID NO's: 1 and 8) in a 5' to 3'
direction and those sequences which hybridize thereto under
stringent hybridization conditions [see, T. Maniatis et al,
Molecular Cloning (A Laboratory Manual), Cold Spring Harbor
Laboratory (1982), pages 387 to 389] and encode a protein
having cartilage and/or bone inducing activity.

Similarly, DNA sequences which code for BMP-9 proteins
coded for by the sequences of Figure 1 or Figure 3, but which
differ in codon sequence due to the degeneracies of the genetic
code or allelic variations (naturally-occurring base changes in
the species population which may or may not result in an amino
acid change) also encode the novel factors described herein.
Variations in the DNA sequences of Figure 1 or Figure 3 (SEQ ID
NO: 1 and 8) which are caused by point mutations or by induced
modifications (including insertion, deletion, and substitution)
to enhance the activity, half-life or production of the
polypeptides encoded are also encompassed in the invention.
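
As a minimal sketch of codon degeneracy, the following Python fragment translates two different DNA strings into the same peptide using the standard genetic code. The first string is the run of ten codons beginning the row numbered 144 of the human sequence fragment reproduced at the head of this document. The second is a hypothetical synonymous variant constructed for this example and is not disclosed in the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, a trailing partial codon is dropped."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


source_codons = "GGG TCG ACT TTA GCC AGG CGG AAA AGG AGC"   # from the fragment above
synonymous = "GGA TCC ACC CTG GCT CGT CGC AAG AGA TCT"      # hypothetical variant

print(translate(source_codons))
print(translate(synonymous))
```

Both strings encode the same ten residues although every codon differs, which is the sense in which sequences differing only by degeneracy encode the same factor.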

Another aspect of the present invention provides a novel
method for producing BMP-9 proteins. The method of the present
invention involves culturing a suitable cell line, which has
been transformed with a DNA sequence encoding a BMP-9 protein
of the invention, under the control of known regulatory
sequences. The transformed host cells are cultured and the
BMP-9 proteins recovered and purified from the culture medium.
The purified proteins are substantially free from other
proteins with which they are co-produced as well as from other
contaminants.

Suitable cells or cell lines may be mammalian cells, such
as Chinese hamster ovary cells (CHO). The selection of
suitable mammalian host cells and methods for transformation,
culture, amplification, screening, product production and
purification are known in the art. See, e.g., Gething and
Sambrook, Nature, 293:620-625 (1981), or alternatively, Kaufman
et al, Mol. Cell. Biol., 5(7):1750-1759 (1985) or Howley et al,
U.S. Patent 4,419,446. Another suitable mammalian cell line,
which is described in the accompanying examples, is the monkey
COS-1 cell line. The mammalian cell CV-1 may also be suitable.

Bacterial cells may also be suitable hosts. For example,
the various strains of E. coli (e.g., HB101, MC1061) are
well-known as host cells in the field of biotechnology.
Various strains of B. subtilis, Pseudomonas, other bacilli and
the like may also be employed in this method.

Many strains of yeast cells known to those skilled in the
art may also be available as host cells for expression of the
polypeptides of the present invention. Additionally, where
desired, insect cells may be utilized as host cells in the
method of the present invention. See, e.g. Miller et al,
Genetic Engineering, 8:277-298 (Plenum Press 1986) and
references cited therein.

Another aspect of the present invention provides vectors
for use in the method of expression of these novel BMP-9
polypeptides. Preferably the vectors contain the full novel
DNA sequences described above which encode the novel factors of
the invention. Additionally the vectors also contain
appropriate expression control sequences permitting expression
of the BMP-9 protein sequences. Alternatively, vectors
incorporating modified sequences as described above are also
embodiments of the present invention. The vectors may be
employed in the method of transforming cell lines and contain
selected regulatory sequences in operative association with the
DNA coding sequences of the invention which are capable of
directing the replication and expression thereof in selected
host cells. Regulatory sequences for such vectors are known to
those skilled in the art and may be selected depending upon the
host cells. Such selection is routine and does not form part
of the present invention.

A protein of the present invention, which induces
cartilage and/or bone formation in circumstances where bone is
not normally formed, has application in the healing of bone
fractures and cartilage defects in humans and other animals.
Such a preparation employing a BMP-9 protein may have
prophylactic use in closed as well as open fracture reduction
and also in the improved fixation of artificial joints. De
novo bone formation induced by an osteogenic agent contributes
to the repair of congenital, trauma induced, or oncologic
resection induced craniofacial defects, and also is useful in
cosmetic plastic surgery. A BMP-9 protein may be used in the
treatment of periodontal disease, and in other tooth repair
processes. Such agents may provide an environment to attract
bone-forming cells, stimulate growth of bone-forming cells or
induce differentiation of progenitors of bone-forming cells.
BMP-9 polypeptides of the invention may also be useful in the
treatment of osteoporosis. A variety of osteogenic,
cartilage-inducing and bone inducing factors have been
described. See, e.g. European patent applications 148,155 and
169,016 for discussions thereof.

The proteins of the invention may also be used in wound
healing and related tissue repair. The types of wounds
include, but are not limited to, burns, incisions and ulcers.
(See, e.g. PCT Publication WO84/01106 for discussion of wound
healing and related tissue repair).

It is further contemplated that proteins of the invention 
may increase neuronal survival and therefore be useful in 
transplantation and treatment of conditions exhibiting a 
decrease in neuronal survival. 

A further aspect of the invention is a therapeutic method
and composition for repairing fractures and other conditions
related to cartilage and/or bone defects or periodontal
diseases. The invention further comprises therapeutic methods and
compositions for wound healing and tissue repair. Such
compositions comprise a therapeutically effective amount of at
least one of the BMP-9 proteins of the invention in admixture
with a pharmaceutically acceptable vehicle, carrier or matrix.

It is expected that the proteins of the invention may act
in concert with or perhaps synergistically with other related
proteins and growth factors. Further therapeutic methods and
compositions of the invention therefore comprise a therapeutic
amount of at least one BMP-9 protein of the invention with a
therapeutic amount of at least one of the other BMP proteins
disclosed in co-owned applications described above. Such
combinations may comprise separate molecules of the BMP
proteins or heteromolecules comprised of different BMP
moieties. For example, a method and composition of the
invention may comprise a disulfide linked dimer comprising a
BMP-9 protein subunit and a subunit from one of the "BMP"
proteins described above. A further embodiment may comprise a
heterodimer of BMP-9 moieties. Further, BMP-9 proteins may be
combined with other agents beneficial to the treatment of the
bone and/or cartilage defect, wound, or tissue in question.
These agents include various growth factors such as epidermal
growth factor (EGF), platelet derived growth factor (PDGF),
transforming growth factors (TGF-α and TGF-β), and insulin-like
growth factor (IGF).

The preparation and formulation of such physiologically
acceptable protein compositions, having due regard to pH,
isotonicity, stability and the like, is within the skill of the
art. The therapeutic compositions are also presently valuable
for veterinary applications due to the lack of species
specificity in BMP proteins. In particular, domestic animals and
thoroughbred horses, in addition to humans, are desired patients
for such treatment with BMP-9 of the present invention.

The therapeutic method includes administering the
composition topically, systemically, or locally as an implant
or device. When administered, the therapeutic composition for
use in this invention is, of course, in a pyrogen-free,
physiologically acceptable form. Further, the composition may
desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage.
Topical administration may be suitable for wound healing and
tissue repair. Therapeutically useful agents other than the
BMP-9 proteins, which may also optionally be included in the
composition as described above, may alternatively or
additionally be administered simultaneously or sequentially
with the BMP composition in the methods of the invention.

Preferably for bone and/or cartilage formation, the
composition would include a matrix capable of delivering BMP-9
or other BMP proteins to the site of bone and/or cartilage
damage, providing a structure for the developing bone and
cartilage and optimally capable of being resorbed into the
body. The matrix may provide slow release of BMP-9 and/or the
appropriate environment for presentation thereof. Such
matrices may be formed of materials presently in use for other
implanted medical applications.

The choice of matrix material is based on
biocompatibility, biodegradability, mechanical properties,
cosmetic appearance and interface properties. The particular
application of the BMP-9 compositions will define the
appropriate formulation. Potential matrices for the
compositions may be biodegradable and chemically defined
calcium sulfate, tricalciumphosphate, hydroxyapatite,
polylactic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well defined, such as bone
or dermal collagen. Further matrices are comprised of pure
proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as
sintered hydroxyapatite, bioglass, aluminates, or other
ceramics. Matrices may be comprised of combinations of any of
the above mentioned types of material, such as polylactic acid
and hydroxyapatite or collagen and tricalciumphosphate. The
bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size,
particle size, particle shape, and biodegradability.

The dosage regimen will be determined by the attending
physician considering various factors which modify the action
of the BMP-9 protein, e.g. amount of bone weight desired to be
formed, the site of bone damage, the condition of the damaged
bone, the size of a wound, type of damaged tissue, the
patient's age, sex, and diet, the severity of any infection,
time of administration and other clinical factors. The dosage
may vary with the type of matrix used in the reconstitution and
the types of BMP proteins in the composition. The addition of
other known growth factors, such as IGF I (insulin like growth
factor I), to the final composition, may also affect the
dosage. Progress can be monitored by periodic assessment of
bone growth and/or repair, for example, x-rays,
histomorphometric determinations and tetracycline labeling.

The following examples illustrate practice of the present
invention in recovering and characterizing murine BMP-9 protein
and employing it to recover the human and other BMP-9 proteins,
obtaining the human proteins and expressing the proteins via
recombinant techniques.

EXAMPLE I
Murine BMP-9

750,000 recombinants of a mouse liver cDNA library made in
the vector lambdaZAP (Stratagene, Catalog #935302) are plated
and duplicate nitrocellulose replicas made. A fragment of
human BMP-4 DNA corresponding to nucleotides 1330-1627 of
Figure 2 (SEQ ID NO: 3) (the human BMP-4 sequence) is
³²P-labeled by the random priming procedure of Feinberg et al.
[Anal. Biochem. 132: 6-13 (1983)] and hybridized to both sets
of filters in SHB at 60°C for 2 to 3 days. Both sets of filters
are washed under reduced stringency conditions (4X SSC, 0.1%
SDS at 60°C). Many duplicate hybridizing recombinants of
various intensities (approximately 92) are noted. 50 of the
strongest hybridizing recombinant bacteriophage are plaque
purified and their inserts are transferred to the plasmid
Bluescript SK (+/-) according to the in vivo excision protocol
described by the manufacturer (Stratagene). DNA sequence
analysis of several recombinants indicates that they encode a
protein homologous to other BMP proteins and other proteins in
the TGF-β family. The DNA sequence and derived amino acid
sequence of one recombinant, designated ML14a, is set forth in
Figure 1 (SEQ ID NO: 1).

The nucleotide sequence of clone ML14a contains an open
reading frame of 1284 bp, encoding a BMP-9 protein of 428 amino
acids. The encoded 428 amino acid BMP-9 protein is
contemplated to be the primary translation product as the
coding sequence is preceded by 609 bp of 5' untranslated
sequence with stop codons in all three reading frames. The 428
amino acid sequence predicts a BMP-9 protein with a molecular
weight of 48,000 daltons.
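
The figures stated above can be cross-checked with a short calculation, offered here only as a sketch. The coordinates are those given in this document for the murine sequence. The average residue mass of 110 daltons is a rule-of-thumb assumption introduced for this example and is not the method by which the stated molecular weights were derived.

```python
# Coordinates taken from the text (murine BMP-9, SEQ ID NO: 1 and 2).
ORF_START, ORF_END = 610, 1893        # nucleotides of the coding sequence
MATURE_START, MATURE_END = 319, 428   # amino acids of the predicted mature peptide

# Assumption for illustration: average mass of one amino acid residue.
AVG_RESIDUE_DA = 110

orf_bp = ORF_END - ORF_START + 1              # inclusive coordinates
codons = orf_bp // 3
utr5_bp = ORF_START - 1                       # bases preceding the first codon
mature_len = MATURE_END - MATURE_START + 1

approx_precursor_da = codons * AVG_RESIDUE_DA
approx_mature_da = mature_len * AVG_RESIDUE_DA

print(orf_bp, codons, utr5_bp, mature_len)
print(approx_precursor_da, approx_mature_da)
```

Under these assumptions the open reading frame is 1284 bp and 428 codons, preceded by 609 bp, and the rough mass estimates of about 47,000 and 12,000 daltons are in line with the predicted values given in this example for the precursor and the mature subunit.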

Based on knowledge of other BMP proteins and other
proteins within the TGF-β family, it is predicted that the
precursor polypeptide would be cleaved at the multibasic
sequence ARG-ARG-LYS-ARG in agreement with a proposed consensus
proteolytic processing sequence of ARG-X-X-ARG. Cleavage of
the BMP-9 precursor polypeptide at this location would generate
a 110 amino acid mature peptide beginning with the amino acid
SER at position #319. The processing of BMP-9 into the mature
form is expected to involve dimerization and removal of the
N-terminal region in a manner analogous to the processing of the
related protein TGF-β [L.E. Gentry, et al., Molec. & Cell.
Biol. 8:4162 (1988); R. Derynck, et al., Nature 316:701
(1985)].

It is contemplated therefore that the mature active
species of murine BMP-9 comprises a homodimer of 2 polypeptide
subunits, each subunit comprising amino acids #319-#428 with a
predicted molecular weight of approximately 12,000 daltons.
Further active species are contemplated comprising amino acids
#326 - #428 thereby including the first conserved cysteine
residue. As with other members of the BMP and TGF-β family of
proteins, the carboxy-terminal region of the BMP-9 protein
exhibits greater sequence conservation than the more
amino-terminal portion. The percent amino acid identity of the
murine BMP-9 protein in the cysteine-rich C-terminal domain
(amino acids #326 - #428) to the corresponding region of other
human BMP proteins and other proteins within the TGF-β family
is as follows: BMP-2, 53%; BMP-3, 43%; BMP-4, 53%; BMP-5, 55%;
BMP-6, 55%; BMP-7, 53%; Vg1, 50%; GDF-1, 43%; TGF-β1, 32%;
TGF-β2, 34%; TGF-β3, 34%; inhibin β(B), 34%; and inhibin β(A), 42%.
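
A percent identity figure of the kind listed above may be computed from two aligned sequences as sketched below. The function name, the handling of gap characters, and the short strings are assumptions made for this example. The source does not state how alignment gaps were treated or which alignment method was used.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical residues between two pre-aligned sequences.

    Positions where either sequence carries a gap ('-') are skipped; this
    choice of denominator is an assumption, not taken from the source.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


# Illustrative strings only; they are not the 103-residue domains compared above.
example_a = "CKGGCFFPLAD"
example_b = "CKGGCFFPFAD"
print(round(percent_identity(example_a, example_b), 1))
```

The two example strings differ at one of eleven positions, so the sketch reports roughly 90.9 percent identity.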

EXAMPLE II 
Human BMP-9 

Murine and human osteoinductive factor genes are presumed
to be significantly homologous; therefore the murine coding
sequence or a portion thereof is used as a probe to screen a
human genomic library or as a probe to identify a human cell
line or tissue which synthesizes the analogous human cartilage
and/or bone protein. A human genomic library (Toole et al.,
supra) may be screened with such a probe, and presumptive
positives isolated and DNA sequence obtained. Evidence that
this recombinant encodes a portion of the human BMP-9 relies on
the murine/human protein and gene structure homologies.

Once a recombinant bacteriophage containing DNA encoding
a portion of the human cartilage and/or bone inductive factor
molecule is obtained, the human coding sequence can be used as
a probe to identify a human cell line or tissue which
synthesizes BMP-9. Alternatively, the murine coding sequence
can be used as a probe to identify such human cell line or
tissue. Briefly described, RNA is extracted from a selected
cell or tissue source and either electrophoresed on a
formaldehyde agarose gel and transferred to nitrocellulose, or
reacted with formaldehyde and spotted on nitrocellulose
directly. The nitrocellulose is then hybridized to a probe
derived from a coding sequence of the murine or human BMP-9.
mRNA is selected by oligo (dT) cellulose chromatography and
cDNA is synthesized and cloned in lambda gt10 or lambda ZAP by
established techniques (Toole et al., supra).

Additional methods known to those skilled in the art may
be used to isolate the human and other species' BMP-9 proteins
of the invention.

A. Isolation of Human BMP-9 DNA 

One million recombinants of a human genomic library
constructed in the vector λFIX (Stratagene catalog #944201)
are plated and duplicate nitrocellulose replicas made. Two
oligonucleotide probes designed on the basis of nucleotides
#1665-#1704 and #1837-#1876 of the sequence set forth in Figure
1 (SEQ ID NO: 1) are synthesized on an automated DNA
synthesizer. The sequence of these two oligonucleotides is
indicated below:

#1: CTATGAGTGTAAAGGGGGTTGCTTCTTCCCATTGGCTGAT
#2: GTGCCAACCCTCAAGTACCACTATGAGGGGATGAGTGTGG
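
As a simple consistency check, offered as a sketch only, the length of each oligonucleotide can be compared with the nucleotide range from which it is said to be designed. The helper functions are hypothetical names introduced for this example. The GC fraction and reverse complement are included as quantities commonly inspected for hybridization probes, not as steps described in the source.

```python
# Probe sequences and Figure 1 coordinates as given in the text.
PROBES = {
    "#1": ("CTATGAGTGTAAAGGGGGTTGCTTCTTCCCATTGGCTGAT", 1665, 1704),
    "#2": ("GTGCCAACCCTCAAGTACCACTATGAGGGGATGAGTGTGG", 1837, 1876),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


for name, (seq, start, end) in PROBES.items():
    span = end - start + 1  # inclusive coordinate range
    print(name, len(seq), span, f"GC={gc_fraction(seq):.2f}")
    print("   reverse complement:", reverse_complement(seq))
```

Each probe is a 40-mer, matching the 40-nucleotide ranges from which the probes are designed.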
These two oligonucleotide probes are radioactively labeled with
γ-32P-ATP and each is hybridized to one set of the duplicate
nitrocellulose replicas in SHB at 65°C and washed with 1X SSC,
0.1% SDS at 65°C. Three recombinants which hybridize to both
oligonucleotide probes are noted. All three positively
hybridizing recombinants are plaque purified, bacteriophage
plate stocks are prepared and bacteriophage DNA is isolated
from each. The oligonucleotide hybridizing region of one of
these recombinants, designated HG111, is localized to a 1.2 kb
Pst I/Xba I fragment. This fragment is subcloned into a
plasmid vector (pGEM-3) and DNA sequence analysis is performed.
HG111 was deposited with the ATCC, 12301 Parklawn Drive,
Rockville, Maryland USA on June 16, 1992 under the requirements
of the Budapest Treaty and designated as ATCC # 75252. The
pGEM-3 subclone is designated pGEM-111. A portion of the DNA sequence
of clone pGEM-111 is set forth in Figure 3 (SEQ ID NO: 8; human
BMP-9 sequence). This sequence encodes the entire mature
region of human BMP-9 and a portion of the propeptide. It
should be noted that this sequence consists of preliminary
data. In particular, the propeptide region is subject to
further analysis and characterization. For example,
nucleotides #1 through #3 (TGA) encode a translational stop
which may be incorrect due to the preliminary nature of the
sequence. It is predicted that additional sequences present in
both pGEM-111 (the 1.2 kb Pst I/Xba I fragment of HG111 subcloned
into pGEM) and HG111 encode additional amino acids of the human
BMP-9 propeptide region. Based on knowledge of other BMPs and
other proteins within the TGF-β family, it is predicted that
the precursor polypeptide would be cleaved at the multibasic
sequence ARG-ARG-LYS-ARG (amino acids #-4 through #-1 of
SEQUENCE ID NO: 9) in agreement with a proposed consensus
proteolytic processing sequence ARG-X-X-ARG. Cleavage of the
human BMP-9 precursor polypeptide at this location would
generate a 110 amino acid mature peptide beginning with the
amino acid SER at position #1 of SEQUENCE ID NO: 9 (encoded by
nucleotides #124 through #126 of SEQUENCE ID NO: 8). The
processing of human BMP-9 into the mature form is expected to
involve dimerization and removal of the N-terminal region in a
manner analogous to the processing of the related protein TGF-β
[L.E. Gentry, et al., Molec. & Cell. Biol. 8:4162 (1988);
Derynck, et al., Nature 316:701 (1985)].
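
The residue and nucleotide coordinates quoted above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the only inputs are the mature peptide coordinates (#124 through #453) given in the feature table for SEQ ID NO: 8. It assumes the numbering convention of SEQ ID NO: 9, in which the mature residues are numbered 1, 2, ... and the propeptide residues -1, -2, ... with no residue 0.

```python
MATURE_START_NT = 124  # first nucleotide of the mature peptide (SEQ ID NO: 8)
MATURE_END_NT = 453    # last nucleotide of the mature peptide (SEQ ID NO: 8)


def codon_count(first_nt: int, last_nt: int) -> int:
    """Number of whole codons in the inclusive, 1-based range first_nt..last_nt."""
    length = last_nt - first_nt + 1
    if length % 3:
        raise ValueError("range is not a whole number of codons")
    return length // 3


def nt_range_for_residue(position: int, mature_start_nt: int = MATURE_START_NT):
    """Nucleotides encoding a residue numbered as in SEQ ID NO: 9.

    Positive positions count from the first mature residue; negative
    positions count backwards into the propeptide. There is no residue 0.
    """
    if position == 0:
        raise ValueError("there is no residue 0 in this numbering")
    offset = position - 1 if position > 0 else position
    first = mature_start_nt + 3 * offset
    return first, first + 2


mature_length = codon_count(MATURE_START_NT, MATURE_END_NT)   # mature residues
upstream_codons = codon_count(1, MATURE_START_NT - 1)          # codons before Ser #1
cleavage_site_nt = (nt_range_for_residue(-4)[0], nt_range_for_residue(-1)[1])
```

Under these assumptions the mature region spans 110 codons, 41 codons (including the TGA at #1 through #3) precede it, and the ARG-ARG-LYS-ARG site at residues -4 through -1 maps to nucleotides #112 through #123.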

It is contemplated therefore that the mature active
species of human BMP-9 comprises a homodimer of two polypeptide
subunits, each subunit comprising amino acids #1 through #110
of SEQUENCE ID NO: 9, with a predicted molecular weight of
12,000 daltons. Further active species are contemplated
comprising amino acids #8 through #110 thereby including the
first conserved cysteine residue. As with other members of the
BMP and TGF-β family of proteins, the carboxy-terminal portion
of the human BMP-9 sequence exhibits greater sequence
conservation than the amino-terminal portion. The percent
amino acid identity of the human BMP-9 protein in the
cysteine-rich C-terminal domain (amino acids #8 through #110) to the
corresponding region of other human BMP proteins and other
proteins within the TGF-β family is as follows: BMP-2, 52%;
BMP-3, 40%; BMP-4, 52%; BMP-5, 55%; BMP-6, 55%; BMP-7, 53%;
murine BMP-9, 97%; Vg1, 50%; GDF-1, 44%; TGF-β1, 32%; TGF-β2,
32%; TGF-β3, 32%; inhibin β(B), 35%; and inhibin β(A), 41%.
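
The predicted subunit molecular weight can be approximated from the amino acid composition. The sketch below is an illustration rather than the method used to obtain the figure above: the mature sequence is transcribed in one-letter code from amino acids #1 through #110 of SEQ ID NO: 9, the residue masses are standard average values, and the names `MATURE_HUMAN_BMP9` and `average_mass` are hypothetical. Disulfide bonds and any post-translational modification are ignored.

```python
# Average residue masses in daltons (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015  # one water per polypeptide chain for the free termini

# Amino acids #1-#110 of SEQ ID NO: 9, transcribed by hand (an assumption
# of this sketch; check against the sequence listing before relying on it).
MATURE_HUMAN_BMP9 = (
    "SAGAGSH"
    "CQKTSLRVNFEDIGWD"
    "SWIIAPKEYEAYECKG"
    "GCFFPLADDVTPTKHA"
    "IVQTLVHLKFPTKVGK"
    "ACCVPTKLSPISVLYK"
    "DDMGVPTLKYHYEGMS"
    "VAECGCR"
)


def average_mass(sequence: str) -> float:
    """Average mass in daltons of an unmodified, reduced polypeptide chain."""
    return sum(AVG_RESIDUE_MASS[residue] for residue in sequence) + WATER


subunit_mass = average_mass(MATURE_HUMAN_BMP9)
first_cysteine = MATURE_HUMAN_BMP9.index("C") + 1  # 1-based residue number
```

With this transcription the subunit mass comes out close to 12,000 daltons, the chain contains seven cysteines, and the first cysteine falls at residue #8, in line with the description of the shorter species beginning at amino acid #8.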

EXAMPLE III

Rosen Modified Sampath-Reddi Assay 

A modified version of the rat bone formation assay
described in Sampath and Reddi, Proc. Natl. Acad. Sci. U.S.A.,
80:6591-6595 (1983) is used to evaluate bone and/or cartilage
activity of the BMP proteins. This modified assay is herein
called the Rosen-modified Sampath-Reddi assay. The ethanol
precipitation step of the Sampath-Reddi procedure is replaced
by dialyzing (if the composition is a solution) or diafiltering
(if the composition is a suspension) the fraction to be assayed
against water. The solution or suspension is then redissolved
in 0.1% TFA, and the resulting solution added to 20 mg of rat
matrix. A mock rat matrix sample not treated with the protein
serves as a control. This material is frozen and lyophilized
and the resulting powder enclosed in #5 gelatin capsules. The
capsules are implanted subcutaneously in the abdominal thoracic
area of 21-49 day old male Long Evans rats. The implants are
removed after 7-14 days. Half of each implant is used for
alkaline phosphatase analysis [See, A. H. Reddi et al.,
Proc. Natl. Acad. Sci., 69:1601 (1972)].

The other half of each implant is fixed and processed for
histological analysis. 1 µm glycolmethacrylate sections are
stained with Von Kossa and acid fuchsin to score the amount of
induced bone and cartilage formation present in each implant.
The terms +1 through +5 represent the area of each
histological section of an implant occupied by new bone and/or
cartilage cells and matrix. A score of +5 indicates that
greater than 50% of the implant is new bone and/or cartilage
produced as a direct result of protein in the implant. A score
of +4, +3, +2 and +1 would indicate that greater than 40%,
30%, 20% and 10%, respectively, of the implant contains new
cartilage and/or bone. In a modified scoring method, three
non-adjacent sections are evaluated from each implant and
averaged. "+/-" indicates tentative identification of
cartilage or bone; "+1" indicates >10% of each section being
new cartilage or bone; "+2", >25%; "+3", >50%; "+4", ~75%;
"+5", >80%. A "-" indicates that the implant is not recovered.

It is contemplated that the dose response nature of the
BMP-9 containing samples of the matrix samples will demonstrate
that the amount of bone and/or cartilage formed increases with
the amount of BMP-9 in the sample. It is contemplated that the
control samples will not result in any bone and/or cartilage
formation.
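
The first histological scoring scale described in this Example (+5 for greater than 50% new bone and/or cartilage, down to +1 for greater than 10%) amounts to a threshold lookup. The sketch below is one possible encoding, not part of the assay itself; the function name is hypothetical, and returning 0 for sections at or below 10% is an assumption, since the text assigns no score to that range.

```python
# (threshold in percent, score) pairs for the first scoring scale, highest first.
SCORE_THRESHOLDS = [(50, 5), (40, 4), (30, 3), (20, 2), (10, 1)]


def histology_score(percent_new_tissue: float) -> int:
    """Map the percentage of an implant section occupied by new bone
    and/or cartilage to a score of +1 through +5.

    Returns 0 when the percentage does not exceed 10% (an assumption of
    this sketch; the text does not define a score for that case).
    """
    if not 0 <= percent_new_tissue <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    for threshold, score in SCORE_THRESHOLDS:
        if percent_new_tissue > threshold:
            return score
    return 0
```

Because every threshold is a strict "greater than", a section with exactly 50% new tissue scores +4 rather than +5 under this reading.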

As with other cartilage and/or bone inductive proteins
such as the above-mentioned "BMP" proteins, the bone and/or
cartilage formed is expected to be physically confined to the
space occupied by the matrix. Samples are also analyzed by SDS
gel electrophoresis and isoelectric focusing followed by
autoradiography. The activity is correlated with the protein
bands and pI. To estimate the purity of the protein in a
particular fraction an extinction coefficient of 1 OD/mg-cm is
used as an estimate for protein and the protein is run on SDS
PAGE followed by silver staining or radioiodination and
autoradiography.

EXAMPLE IV

Expression of BMP-9 

In order to produce murine, human or other mammalian BMP-9
proteins, the DNA encoding it is transferred into an
appropriate expression vector and introduced into mammalian
cells or other preferred eukaryotic or prokaryotic hosts by
conventional genetic engineering techniques. The preferred
expression system for biologically active recombinant human
BMP-9 is contemplated to be stably transformed mammalian cells.

One skilled in the art can construct mammalian expression
vectors by employing the sequence of Figure 1 (SEQ ID NO: 1) or
Figure 3 (SEQ ID NO: 8), or other DNA sequences encoding BMP-9
proteins or other modified sequences and known vectors, such as
pCD [Okayama et al., Mol. Cell Biol., 2:161-170 (1982)], pJL3,
pJL4 [Gough et al., EMBO J., 4:645-653 (1985)] and pMT2 CXM.

The mammalian expression vector pMT2 CXM is a derivative
of p91023(b) (Wong et al., Science 228:810-815, 1985)
differing from the latter in that it contains the ampicillin
resistance gene in place of the tetracycline resistance gene
and further contains a XhoI site for insertion of cDNA clones.
The functional elements of pMT2 CXM have been described
(Kaufman, R.J., 1985, Proc. Natl. Acad. Sci. USA 82:689-693)
and include the adenovirus VA genes, the SV40 origin of
replication including the 72 bp enhancer, the adenovirus major
late promoter including a 5' splice site and the majority of
the adenovirus tripartite leader sequence present on adenovirus
late mRNAs, a 3' splice acceptor site, a DHFR insert, the SV40
early polyadenylation site (SV40), and pBR322 sequences needed
for propagation in E. coli.

Plasmid pMT2 CXM is obtained by EcoRI digestion of
pMT2-VWF, which has been deposited with the American Type Culture
Collection (ATCC), Rockville, MD (USA) under accession number
ATCC 67122. EcoRI digestion excises the cDNA insert present in
pMT2-VWF, yielding pMT2 in linear form which can be ligated and
used to transform E. coli HB 101 or DH-5 to ampicillin
resistance. Plasmid pMT2 DNA can be prepared by conventional
methods. pMT2 CXM is then constructed using loopout/in
mutagenesis [Morinaga, et al., Biotechnology 84:636 (1984)].
This removes bases 1075 to 1145 relative to the Hind III site
near the SV40 origin of replication and enhancer sequences of
pMT2. In addition it inserts the following sequence:

5' PO-CATGGGCAGCTCGAG-3' (SEQ ID NO: 5)

at nucleotide 1145. This sequence contains the recognition
site for the restriction endonuclease Xho I. A derivative of
pMT2 CXM, termed pMT23, contains recognition sites for the
restriction endonucleases PstI, EcoRI, SalI and XhoI. Plasmid
pMT2 CXM and pMT23 DNA may be prepared by conventional methods.

pEMC2β1 derived from pMT21 may also be suitable in
practice of the invention. pMT21 is derived from pMT2 which is
derived from pMT2-VWF. As described above EcoRI digestion
excises the cDNA insert present in pMT2-VWF, yielding pMT2 in
linear form which can be ligated and used to transform E. coli
HB 101 or DH-5 to ampicillin resistance. Plasmid pMT2 DNA can
be prepared by conventional methods.

pMT21 is derived from pMT2 through the following two
modifications. First, 76 bp of the 5' untranslated region of
the DHFR cDNA including a stretch of 19 G residues from G/C
tailing for cDNA cloning is deleted. In this process, a XhoI
site is inserted to obtain the following sequence immediately
upstream from DHFR:

5'-CTGCAGGCGAGCCTGAATTCCTCGAGCCATCATG-3'
   PstI          EcoRI XhoI
(SEQ ID NO: 6)

Second, a unique ClaI site is introduced by digestion with
EcoRV and XbaI, treatment with Klenow fragment of DNA
polymerase I, and ligation to a ClaI linker (CATCGATG). This
deletes a 250 bp segment from the adenovirus associated RNA
(VAI) region but does not interfere with VAI RNA gene
expression or function. pMT21 is digested with EcoRI and XhoI,
and used to derive the vector pEMC2β1.

A portion of the EMCV leader is obtained from pMT2-ECAT1
[S.K. Jung, et al., J. Virol. 63:1651-1660 (1989)] by digestion
with EcoRI and PstI, resulting in a 2752 bp fragment. This
fragment is digested with TaqI yielding an EcoRI-TaqI fragment
of 508 bp which is purified by electrophoresis on low melting
agarose gel. A 68 bp adapter and its complementary strand are
synthesized with a 5' TaqI protruding end and a 3' XhoI
protruding end which has the following sequence:

5'-CGAGGTTAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTT
   TaqI
   GAAAAACACGATTGC-3'
                XhoI     (SEQ ID NO: 7)

This sequence matches the EMC virus leader sequence from
nucleotide 763 to 827. It also changes the ATG at position 10
within the EMC virus leader to an ATT and is followed by a XhoI
site. A three way ligation of the pMT21 EcoRI-XhoI fragment,
the EMC virus EcoRI-TaqI fragment, and the 68 bp
TaqI-XhoI oligonucleotide adapter results in the
vector pEMC2β1.

This vector contains the SV40 origin of replication and
enhancer, the adenovirus major late promoter, a cDNA copy of
the majority of the adenovirus tripartite leader sequence, a
small hybrid intervening sequence, an SV40 polyadenylation
signal and the adenovirus VA I gene, DHFR and β-lactamase
markers and an EMC sequence, in appropriate relationships to
direct the high level expression of the desired cDNA in
mammalian cells.

The construction of vectors may involve modification of
the BMP-9 DNA sequences. For instance, BMP-9 cDNA can be
modified by removing the non-coding nucleotides on the 5' and
3' ends of the coding region. The deleted non-coding
nucleotides may or may not be replaced by other sequences known
to be beneficial for expression. These vectors are transformed
into appropriate host cells for expression of BMP-9 proteins.

One skilled in the art can manipulate the sequences of
Figure 1 or Figure 3 (SEQ ID NO: 1 and 8) by eliminating or
replacing the mammalian regulatory sequences flanking the
coding sequence with bacterial sequences to create bacterial
vectors for intracellular or extracellular expression by
bacterial cells. For example, the coding sequences could be
further manipulated (e.g. ligated to other known linkers or
modified by deleting non-coding sequences therefrom or altering
nucleotides therein by other known techniques). The modified
BMP-9 coding sequence could then be inserted into a known
bacterial vector using procedures such as described in T.
Taniguchi et al., Proc. Natl. Acad. Sci. USA, 77:5230-5233
(1980). This exemplary bacterial vector could then be
transformed into bacterial host cells and a BMP-9 protein
expressed thereby. For a strategy for producing extracellular
expression of BMP-9 proteins in bacterial cells, see, e.g.
European patent application EPA 177,343.

Similar manipulations can be performed for the
construction of an insect vector [See, e.g. procedures
described in published European patent application 155,476] for
expression in insect cells. A yeast vector could also be
constructed employing yeast regulatory sequences for
intracellular or extracellular expression of the factors of the
present invention by yeast cells. [See, e.g., procedures
described in published PCT application WO86/00639 and European
patent application EPA 123,289].

A method for producing high levels of a BMP-9 protein of
the invention in mammalian cells may involve the construction
of cells containing multiple copies of the heterologous BMP-9
gene. The heterologous gene is linked to an amplifiable
marker, e.g. the dihydrofolate reductase (DHFR) gene, for which
cells containing increased gene copies can be selected by
propagation in increasing concentrations of methotrexate (MTX)
according to the procedures of Kaufman and Sharp, J. Mol.
Biol., 159:601-629 (1982). This approach can be employed with
a number of different cell types.

For example, a plasmid containing a DNA sequence for a
BMP-9 of the invention in operative association with other
plasmid sequences enabling expression thereof and the DHFR
expression plasmid pAdA26SV(A)3 [Kaufman and Sharp, Mol. Cell.
Biol., 2:1304 (1982)] can be co-introduced into DHFR-deficient
CHO cells, DUKX-BII, by various methods including calcium
phosphate coprecipitation and transfection, electroporation or
protoplast fusion. DHFR expressing transformants are selected
for growth in alpha media with dialyzed fetal calf serum, and
subsequently selected for amplification by growth in increasing
concentrations of MTX (e.g. sequential steps in 0.02, 0.2, 1.0
and 5 µM MTX) as described in Kaufman et al., Mol. Cell. Biol.,
5:1750 (1983). Transformants are cloned, and biologically
active BMP-9 expression is monitored by the Rosen-modified
Sampath-Reddi rat bone formation assay described above in
Example III. BMP-9 expression should increase with increasing
levels of MTX resistance. BMP-9 polypeptides are characterized
using standard techniques known in the art such as pulse
labeling with [35S] methionine or cysteine and polyacrylamide
gel electrophoresis. Similar procedures can be followed to
produce other related BMP-9 proteins.

A. BMP-9 Vector Construction

In order to produce human BMP-9 proteins of the invention
DNA sequences encoding the mature region of the human BMP-9
protein may be joined to DNA sequences encoding the propeptide
region of the murine BMP-9 protein. This murine/human hybrid
DNA sequence is inserted into an appropriate expression vector
and introduced into mammalian cells or other preferred
eukaryotic or prokaryotic hosts by conventional genetic
engineering techniques. The construction of this murine/human
BMP-9 containing expression plasmid is described below.

A derivative of the human BMP-9 sequence (SEQ ID NO: 8)
comprising the nucleotide sequence from nucleotide #105 to #470
is specifically amplified. The following oligonucleotides are
utilized as primers to allow the amplification of nucleotides
#105 to #470 of the human BMP-9 sequence (SEQ ID NO: 8) from
clone pGEM-111 described above.

#3 ATCGGGCCCCTTTTAGCCAGGCGGAAAAGGAG 
#4 AGCGAATTCCCCGCAGGCAGATACTACCTG 
This procedure generates the insertion of the nucleotide
sequence ATCGGGCCCCT immediately preceding nucleotide #105 and
the insertion of the nucleotide sequence GAATTCGCT immediately
following nucleotide #470. The addition of these sequences
results in the creation of an Apa I and an EcoR I restriction
endonuclease site at the respective ends of the specifically
amplified DNA fragment. The resulting 374 bp Apa I/EcoR I
fragment is subcloned into the plasmid vector pGEM-7Zf(+)
(Promega catalog #P2251) which has been digested with Apa I and
EcoR I. The resulting clone is designated phBMP9mex-1.
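
The relationship between primers #3 and #4 and the amplified region can be verified with a few lines of Python. This is a minimal sketch under stated assumptions: the two template stretches are copied from the ends of the #105-#470 region of SEQ ID NO: 8 as printed in the sequence listing, and the names `revcomp`, `build_primers` and `product_length` are hypothetical helpers introduced for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Template stretches from SEQ ID NO: 8 (1-based, inclusive coordinates).
TEMPLATE_5P = "TTTAGCCAGGCGGAAAAGGAG"  # nucleotides #105-#125
TEMPLATE_3P = "CAGGTAGTATCTGCCTGCGGG"  # nucleotides #450-#470

# Added sequences named in the text.
APA_TAIL = "ATCGGGCCCCT"  # inserted immediately before #105; contains GGGCCC
ECO_TAIL = "GAATTCGCT"    # inserted immediately after #470; contains GAATTC


def build_primers():
    """Return (forward, reverse) primers, each written 5' to 3'."""
    forward = APA_TAIL + TEMPLATE_5P
    reverse = revcomp(TEMPLATE_3P + ECO_TAIL)
    return forward, reverse


def product_length(first_nt: int, last_nt: int) -> int:
    """Length of the amplification product before restriction digestion."""
    return len(APA_TAIL) + (last_nt - first_nt + 1) + len(ECO_TAIL)


PRIMER_3, PRIMER_4 = build_primers()
FULL_PRODUCT_BP = product_length(105, 470)
```

Built this way, the two strings reproduce oligonucleotides #3 and #4 exactly, and the undigested product is 386 bp; digestion with Apa I and EcoR I trims the ends to give the Apa I/EcoR I fragment described above.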

The following oligonucleotides are designed on the basis
of murine BMP-9 sequences (SEQ ID NO:1) and are modified to
facilitate the construction of the murine/human expression
plasmid referred to above:

#5 GATTCCGTCGACCACCATGTCCCCTGGGGCCTGGTCTAGATGGATACACAGCTGTGGGGCC
#6 CCACAGCTGTGTATCCATCTAGACCAGGCCCCAGGGGACATGGTGGTCGACG

These oligonucleotides contain complementary sequences which
upon addition to each other facilitate the annealing (base
pairing) of the two individual sequences, resulting in the
formation of a double stranded synthetic DNA linker (designated
LINK-1) in a manner indicated below:

1 5 10 20 30 40 50 60 

| | | | | | | |

#5 GATTCCGTCGACCACCATGTCCCCTGGGGCCTGGTCTAGATGGATACACAGCTGTGGGGCC
        GCAGCTGGTGGTACAGGGGACCCCGGACCAGATCTACCTATGTGTCGACACC #6

This DNA linker (LINK-1) contains recognition sequences of
restriction endonucleases needed to facilitate subsequent
manipulations required to construct the murine/human expression
plasmid, as well as sequences required for maximal expression
of heterologous sequences in mammalian cell expression systems.
More specifically (referring to the sequence numbering of
oligonucleotide #5/LINK-1): nucleotides #1-#11 comprise
recognition sequences for the restriction endonucleases BamH I
and Sal I, nucleotides #11-#15 allow for maximal expression of
heterologous sequences in mammalian cell expression systems,
nucleotides #16-#31 correspond to nucleotides #610-#625 of the
murine BMP-9 sequence (SEQ ID NO:1), nucleotides #32-#33 are
inserted to facilitate efficient restriction digestion of two
adjacent restriction endonuclease sites (EcoO109 I and Xba I),
nucleotides #34-#60 correspond to nucleotides #1515-#1541 of
the murine BMP-9 sequence (SEQ ID NO:1) except that nucleotide
#58 of synthetic oligonucleotide #5 is a G rather than the A
which appears at position #1539 of SEQ ID NO:1. (This nucleotide
conversion results in the creation of an Apa I restriction
endonuclease recognition sequence, without altering the amino
acid sequence it is intended to encode, to facilitate further
manipulations of the murine/human hybrid expression plasmid.)

LINK-1 (the double stranded product of the annealing of
oligonucleotides #5 and #6) is subcloned into the plasmid
vector pGEM-7Zf(+) which has been digested with the restriction
endonucleases Apa I and BamH I. This results in a plasmid in
which the sequences normally present between the Apa I and BamH
I sites of the pGEM-7Zf(+) plasmid polylinker are replaced with
the sequences of LINK-1 described above. The resulting plasmid
clone is designated pBMP-9link.
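
The coordinate map of LINK-1 given above can be checked against the oligonucleotide sequences. The sketch below rests on one loudly stated assumption: oligonucleotide #5 is taken in a 60-nucleotide form beginning GATCC, which is what the #1-#60 numbering, the BamH I-compatible end and the pairing with oligonucleotide #6 all require, whereas the sequence as printed begins GATTCC and is 61 nucleotides long. All helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def segment(seq: str, first: int, last: int) -> str:
    """Sub-sequence for 1-based, inclusive coordinates first..last."""
    return seq[first - 1:last]


# ASSUMPTION: 60-nt form of oligonucleotide #5 (printed text reads GATTCC...).
OLIGO_5 = (
    "GATCC"                        # BamH I-compatible 5' end
    "GTCGAC"                       # Sal I site, #6-#11
    "CACC"                         # completes the #11-#15 expression context
    "ATGTCCCCTGGGGCCT"             # #16-#31, murine #610-#625
    "GG"                           # #32-#33 spacer
    "TCTAGATGGATACACAGCTGTGGGGCC"  # #34-#60, murine #1515-#1541 with G at #58
)
OLIGO_6 = "CCACAGCTGTGTATCCATCTAGACCAGGCCCCAGGGGACATGGTGGTCGACG"

# name: (first, last, sequence expected at those coordinates)
LINK1_FEATURES = {
    "Sal I (GTCGAC)": (6, 11, "GTCGAC"),
    "EcoO109 I (RGGNCCY)": (25, 31, "GGGGCCT"),
    "Dcm (CCWGG)": (29, 33, "CCTGG"),
    "Xba I (TCTAGA)": (34, 39, "TCTAGA"),
    "Apa I-compatible 3' end": (55, 60, "GGGGCC"),
}


def check_features():
    """Return the names of features whose coordinates do not match."""
    return [
        name
        for name, (first, last, expected) in LINK1_FEATURES.items()
        if segment(OLIGO_5, first, last) != expected
    ]


# Oligonucleotide #6 pairs with #5-#56 of oligonucleotide #5, leaving
# four unpaired nucleotides at each end of the top strand.
PAIRED_REGION = revcomp(OLIGO_6)
LEFT_OVERHANG = segment(OLIGO_5, 1, 4)
RIGHT_OVERHANG = segment(OLIGO_5, 57, 60)
```

Under the stated assumption every coordinate in the description resolves to the expected recognition sequence, and the annealed linker carries a GATC overhang at one end and a GGCC overhang at the other, matching the BamH I and Apa I digestion of the recipient vector.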

pBMP-9link is digested with the restriction endonucleases
BamH I and Xba I resulting in the removal of nucleotides #1-#34 of
LINK-1 (refer to the numbering of oligo #5). Clone ML14a,
which contains an insert comprising the sequence set forth in
SEQ ID NO:1, is also digested with the restriction
endonucleases BamH I and Xba I resulting in the removal of
sequences comprising nucleotides #1-#1515 of SEQUENCE ID NO:1
(murine BMP-9). This BamH I/Xba I fragment of mouse BMP-9 is
isolated from the remainder of the ML14a plasmid clone and
subcloned into the BamH I/Xba I sites generated by the removal
of the synthetic linker sequences described above. The
resulting clone is designated p302.

The p302 clone is digested with the restriction
endonuclease EcoO109 I resulting in the excision of nucleotides
corresponding to nucleotides #621-#1515 of the murine BMP-9
sequence (SEQ ID NO:1) and nucleotides #35-#59 of LINK-1 (refer
to numbering of oligonucleotide #5). It should be noted that
the Apa I restriction site created in LINK-1 by the A to G
conversion described above is a subset of the recognition
sequence of EcoO109 I, therefore digestion of p302 with EcoO109
I cleaves at the Apa I site as well as the naturally occurring
murine EcoO109 I site (location #619-#625 of SEQ ID NO:1) resulting
in the excision of a 920 bp EcoO109 I/EcoO109 I (Apa I)
fragment comprising the sequences described above. This 920 bp
EcoO109 I/EcoO109 I (Apa I) fragment is isolated from the
remainder of the p302 plasmid clone and subcloned into clone
pBMP-9link which has been similarly digested with EcoO109 I.
It should be noted that the nucleotides GG (#32-#33 of
oligonucleotide #5) originally designed to facilitate a more
complete digestion of the two adjacent restriction sites
EcoO109 I and Xba I of LINK-1, which is now a part of
pBMP-9link (described above), result in the creation of a Dcm
methylation recognition sequence. The restriction nuclease
EcoO109 I is sensitive to Dcm methylation and therefore
cleavage of this sequence (nucleotides #25-#31 of
oligonucleotide #5/LINK-1) by the restriction endonuclease
EcoO109 I is prevented at this site. Therefore the plasmid
clone pBMP-9link is cleaved at the Apa I site but not at the
EcoO109 I site upon digestion with the restriction endonuclease
EcoO109 I as described above, preventing the intended removal
of the sequences between the EcoO109 I and Xba I site of LINK-1
(#32-#55 defined by the numbering of oligonucleotide #5). This
results in the insertion of the 920 bp EcoO109 I/Apa I fragment
at the EcoO109 I (Apa I) site of pBMP-9link. The resulting
clone is designated p318.

Clone p318 is digested with the restriction endonucleases
Sal I and Apa I, resulting in the excision of sequences
comprising nucleotides #6-#56 of LINK-1 (refer to oligo #5 for
location), nucleotides #621-#1515 of murine BMP-9 (SEQ ID
NO:1), and nucleotides #35-#60 of LINK-1 (refer to oligo #5 for
location). The resulting 972 bp Sal I/Apa I fragment described
above is isolated from the remainder of the p318 plasmid clone
and will be utilized in subsequent manipulations.
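
The fragment sizes quoted for p302 and p318 follow from the inclusive nucleotide ranges listed in the text. The short sketch below simply adds up those ranges; the names `span`, `ECO_FRAGMENT_BP` and `SAL_APA_FRAGMENT_BP` are hypothetical, and the calculation takes the stated coordinates at face value without modelling where each enzyme cuts within its site.

```python
def span(first: int, last: int) -> int:
    """Length of a 1-based, inclusive nucleotide range."""
    return last - first + 1


# Murine BMP-9 portion shared by both fragments (SEQ ID NO: 1).
MURINE_BP = span(621, 1515)

# EcoO109 I/EcoO109 I (Apa I) fragment excised from p302:
# murine #621-#1515 plus LINK-1 #35-#59.
ECO_FRAGMENT_BP = MURINE_BP + span(35, 59)

# Sal I/Apa I fragment excised from p318:
# LINK-1 #6-#56, murine #621-#1515, LINK-1 #35-#60.
SAL_APA_FRAGMENT_BP = span(6, 56) + MURINE_BP + span(35, 60)
```

The sums come to 895 bp for the murine portion, 920 bp for the EcoO109 I fragment and 972 bp for the Sal I/Apa I fragment, in agreement with the sizes given in the description.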

The clone phBMP9mex-1 (described above), which contains
DNA sequences which encode the entire mature region and
portions of the propeptide of the human BMP-9 protein, is
digested with the restriction endonucleases Apa I and EcoR I.
This results in the excision of a 374 bp fragment comprising
nucleotides #105-#470 of the human BMP-9 sequence (SEQ ID NO: 8)
and the additional nucleotides of oligonucleotide primers #3
and #4 which contain the recognition sequences for the
restriction endonucleases Apa I and EcoR I. This 374 bp Apa
I/EcoR I fragment is combined with the 972 bp Sal I/Apa I
fragment from p318 (isolation described above) and ligated to
the mammalian cell expression plasmid pED6 (a derivative of
pEMC2β1) which has been digested with Sal I and EcoR I. The
resulting clone is designated p324.

The clone ML14a (murine BMP-9) is digested with EcoO109 I
and Xba I to generate a fragment comprising nucleotides
#621-#1515 of SEQ ID NO:1.

The following oligonucleotides are synthesized on an
automated DNA synthesizer and combined such that their
complementary sequences can base pair (anneal) with each other
to generate a double stranded synthetic DNA linker designated
LINK-2:

#7 TCGACCACCATGTCCCCTGG
#8 GCCCCAGGGGACATGGTGG

This double stranded synthetic DNA linker (LINK-2) anneals in
such a way that it generates single stranded ends which are
compatible to DNA fragments digested with Sal I (one end) or
EcoO109 I (the other end) as indicated below:

#7 TCGACCACCATGTCCCCTGG
       GGTGGTACAGGGGACCCCG #8

This LINK-2 synthetic DNA linker is ligated to the 895 bp
EcoO109 I/Xba I fragment comprising nucleotides #621-#1515 of
murine BMP-9 (SEQ ID NO:1) described above. This results in a
915 bp Sal I/Xba I fragment.
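
The single stranded ends produced when oligonucleotides #7 and #8 anneal can be derived computationally. The following sketch is an illustration with hypothetical helper names; it assumes the simplest pairing geometry, in which the 3' end of the top strand is fully paired and each strand carries its unpaired bases at its own 5' end, as in the diagram above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def anneal(top: str, bottom: str):
    """Anneal two oligonucleotides, both written 5' to 3'.

    Returns (top 5' overhang, paired region on the top strand,
    bottom 5' overhang), choosing the longest possible paired region.
    """
    bottom_as_top = revcomp(bottom)
    for start in range(len(top)):
        paired = top[start:]
        if bottom_as_top.startswith(paired):
            unpaired = bottom_as_top[len(paired):]
            return top[:start], paired, revcomp(unpaired)
    raise ValueError("the oligonucleotides do not anneal in this geometry")


OLIGO_7 = "TCGACCACCATGTCCCCTGG"
OLIGO_8 = "GCCCCAGGGGACATGGTGG"

SAL_END, DUPLEX, ECO_END = anneal(OLIGO_7, OLIGO_8)

# Length of the top strand after ligation to murine #621-#1515 (895 bp).
LIGATED_BP = len(OLIGO_7) + (1515 - 621 + 1)
```

The top strand is left with a four-nucleotide TCGA end, as produced by Sal I, and the bottom strand with a three-nucleotide GCC end, which is the reverse complement of the GGC end at murine nucleotides #621-#623 left by EcoO109 I; the ligated top strand is 915 nucleotides long.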

The clone p324 is digested with Sal I/Xba I to remove
sequences comprising nucleotides #6-#56 of LINK-1 (refer to
oligo #5 for location) and nucleotides #621-#1515 of murine
BMP-9 (SEQ ID NO:1). The sequences comprising nucleotides
#35-#60 of LINK-1 (refer to oligo #5 for location) and the
sequences comprising the 374 bp Apa I/EcoR I fragment (human
BMP-9 sequences) derived from phBMP9mex-1 remain attached to
the pED6 backbone. The 915 bp Sal I/Xba I fragment comprising
LINK-2 sequences and nucleotides #621-#1515 of murine BMP-9
(SEQ ID NO:1) is ligated into the p324 clone from which the Sal
I to Xba I sequences described above have been removed.

The resulting plasmid is designated BMP9fusion and
comprises LINK-2, nucleotides #621-#1515 of murine BMP-9 (SEQ
ID NO:1), nucleotides #35-#59 of LINK-1 (refer to the numbering
of oligonucleotide #5), and the 374 bp Apa I/EcoR I fragment
(human BMP-9) derived from clone phBMP9mex-1 (described above)
inserted between the Sal I and EcoR I sites of the mammalian
cell expression vector pED6.

BMP9fusion is transfected into CHO cells using standard
techniques known to those having ordinary skill in the art to
create stable cell lines capable of expressing human BMP-9
protein. The cell lines are cultured under suitable culture
conditions and the BMP-9 protein is isolated and purified from
the culture medium.

EXAMPLE V 

Biological Activity of Expressed BMP-9

To measure the biological activity of the expressed BMP-9
proteins obtained in Example IV above, the proteins are
recovered from the cell culture and purified by isolating the
BMP-9 proteins from other proteinaceous materials with which
they are co-produced as well as from other contaminants. The
purified protein may be assayed in accordance with the rat bone
formation assay described in Example III.

Purification is carried out using standard techniques
known to those skilled in the art. It is contemplated, as with
other BMP proteins, that purification may include the use of
Heparin sepharose.

Protein analysis is conducted using standard techniques
such as SDS-PAGE acrylamide [U.K. Laemmli, Nature 227:680
(1970)] stained with silver [R.R. Oakley, et al. Anal. Biochem.
105:361 (1980)] and by immunoblot [H. Towbin, et al. Proc.
Natl. Acad. Sci. USA 76:4350 (1979)].

The foregoing descriptions detail presently preferred
embodiments of the present invention. Numerous modifications
and variations in practice thereof are expected to occur to
those skilled in the art upon consideration of these
descriptions. Those modifications and variations are believed to be
encompassed within the claims appended hereto.

(1) GENERAL INFORMATION:

(i) APPLICANT: Wozney, John M.
Celeste, Anthony 

(ii) TITLE OF INVENTION: BMP-9 COMPOSITIONS 
(iii) NUMBER OF SEQUENCES: 9 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Genetics Institute, Inc. 

(B) STREET: Legal Affairs - 87 CambridgePark Drive 

(C) CITY: Cambridge 

(D) STATE: MA 

(E) COUNTRY: US 

(F) ZIP: 02140

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Kapinos, Ellen J. 

(B) REGISTRATION NUMBER: 32,245

(C) REFERENCE / DOCKET NUMBER: GI 5186A 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 876-1170 

(B) TELEFAX: (617) 876-5851 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2447 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mus musculus 

(B) STRAIN: C57B46XCBA 
(F) TISSUE TYPE: liver

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Mouse liver cDNA 

(B) CLONE: ML14A 

(viii) POSITION IN GENOME: 

(C) UNITS: bp 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1564..1893

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 610..1896

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 1..2447

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CATTAATAAA TATTAAGTAT TGGAATTAGT GAAATTGGAG TTCCTTGTGG AAGGAAGTGG 60 

GCAAGTGAGC TTTTTAGTTT GTGTCGGAAG CCTGTAATTA CGGCTCCAGC TCATAGTGGA 120 

ATGGCTATAC TTAGATTTAT GGATAGTTGG GTAGTAGGTG TAAATGTATG TGGTAAAAGG 180 

CCTAGGAGAT TTGTTGATCC AATAAATATG ATTAGGGAAA CAATTATTAG GGTTCATGTT 240 

CGTCCTTTTG GTGTGTGGAT TAGCATTATT TGTTTGATAA TAAGTTTAAC TAGTCAGTGT 300 

TGGAAAGAAT GGAGACGGTT GTTGATTAGG CGTTTTGAGG ATGGGAATAG GATTGAAGGA 360 

AATATAATGA TGGCTACAAC GATTGGGAAT CCTATTATTG TTGGGGTAAT GAATGAGGCA 420 

AATAGATTTT CGTTCATTTT AATTCTCAAG GGGTTTTTAC TTTTATGTTT GTTAGTGATA 480 

TTGGTGAGTA GGCCAAGGGT TAATAGTGTA ATTGAATTAT AGTGAAATCA TATTACTAGA 540 

CCTGATGTTA GAAGGAGGGC TGAAAAGGCT CCTTCCCTCC CAGGACAAAA CCGGAGCAGG 600 

GCCACCCGG ATG TCC CCT GGG GCC TTC CGG GTG GCC CTG CTC CCG CTG 648 
Met Ser Pro Gly Ala Phe Arg Val Ala Leu Leu Pro Leu 
-318 -315 -310 

TTC CTG CTG GTC TGT GTC ACA CAG CAG AAG CCG CTG CAG AAC TGG GAA 696 
Phe Leu Leu Val Cys Val Thr Gln Gln Lys Pro Leu Gln Asn Trp Glu
-305                -300                -295                -290

CAA GCA TCC CCT GGG GAA AAT GCC CAC AGC TCC CTG GGA TTG TCT GGA 744
Gln Ala Ser Pro Gly Glu Asn Ala His Ser Ser Leu Gly Leu Ser Gly
                -285                -280                -275

GCT GGA GAG GAG GGT GTC TTT GAC CTG CAG ATG TTC CTG GAG AAC ATG 792
Ala Gly Glu Glu Gly Val Phe Asp Leu Gln Met Phe Leu Glu Asn Met
            -270                -265                -260

AAG GTG GAT TTC CTA CGC AGC CTT AAC CTC AGC GGC ATT CCC TCC CAG 840
Lys Val Asp Phe Leu Arg Ser Leu Asn Leu Ser Gly Ile Pro Ser Gln

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 151 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

*   Thr Arg Glu Cys Ser Arg Ser Cys Pro Arg Thr Ala Pro Gln Arg
-41 -40                 -35                 -30

Gln Val Arg Ala Val Thr Arg Arg Thr Arg Met Ala His Val Ala Ala
-25                 -20                 -15                 -10

Gly Ser Thr Leu Ala Arg Arg Lys Arg Ser Ala Gly Ala Gly Ser His
                -5                  1               5

Cys Gln Lys Thr Ser Leu Arg Val Asn Phe Glu Asp Ile Gly Trp Asp
        10                  15                  20

Ser Trp Ile Ile Ala Pro Lys Glu Tyr Glu Ala Tyr Glu Cys Lys Gly
    25                  30                  35

Gly Cys Phe Phe Pro Leu Ala Asp Asp Val Thr Pro Thr Lys His Ala
40                  45                  50                  55

Ile Val Gln Thr Leu Val His Leu Lys Phe Pro Thr Lys Val Gly Lys
                60                  65                  70

Ala Cys Cys Val Pro Thr Lys Leu Ser Pro Ile Ser Val Leu Tyr Lys
            75                  80                  85

Asp Asp Met Gly Val Pro Thr Leu Lys Tyr His Tyr Glu Gly Met Ser
        90                  95                  100

Val Ala Glu Cys Gly Cys Arg
    105                 110

(ix) FEATURE:

(A) NAME/KEY: exon 

(B) LOCATION: 1..470 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..456 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 124..453

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 1..470 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TGA ACA AGA GAG TGC TCA AGA AGC TGT CCA AGG ACG GCT CCA CAG AGG   48
*   Thr Arg Glu Cys Ser Arg Ser Cys Pro Arg Thr Ala Pro Gln Arg
-41 -40                 -35                 -30

CAG GTG AGA GCA GTC ACG AGG AGG ACA CGG ATG GCG CAC GTG GCT GCG   96
Gln Val Arg Ala Val Thr Arg Arg Thr Arg Met Ala His Val Ala Ala
-25                 -20                 -15                 -10

GGG TCG ACT TTA GCC AGG CGG AAA AGG AGC GCC GGG GCT GGC AGC CAC   144
Gly Ser Thr Leu Ala Arg Arg Lys Arg Ser Ala Gly Ala Gly Ser His
                -5                  1               5

TGT CAA AAG ACC TCC CTG CGG GTA AAC TTC GAG GAC ATC GGC TGG GAC   192
Cys Gln Lys Thr Ser Leu Arg Val Asn Phe Glu Asp Ile Gly Trp Asp
        10                  15                  20

AGC TGG ATC ATT GCA CCC AAG GAG TAT GAA GCC TAC GAG TGT AAG GGC   240
Ser Trp Ile Ile Ala Pro Lys Glu Tyr Glu Ala Tyr Glu Cys Lys Gly
    25                  30                  35

GGC TGC TTC TTC CCC TTG GCT GAC GAT GTG ACG CCG ACG AAA CAC GCT   288
Gly Cys Phe Phe Pro Leu Ala Asp Asp Val Thr Pro Thr Lys His Ala
40                  45                  50                  55

ATC GTG CAG ACC CTG GTG CAT CTC AAG TTC CCC ACA AAG GTG GGC AAG   336
Ile Val Gln Thr Leu Val His Leu Lys Phe Pro Thr Lys Val Gly Lys
                60                  65                  70

[nucleotides 337-384 not legible]   384
Ala Cys Cys Val Pro Thr Lys Leu Ser Pro Ile Ser Val Leu Tyr Lys
            75                  80                  85

[nucleotides 385-432 not legible]   432
Asp Asp Met Gly Val Pro Thr Leu Lys Tyr His Tyr Glu Gly Met Ser
        90                  95                  100

GTG GCA GAG TGT GGG TGC AGG TAGTATCTGC CTGCGGG   470
Val Ala Glu Cys Gly Cys Arg
    105                 110

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEONESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to BtRNA 



PCT/US92/05374 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CTGCAGGCGA GCCTGAATTC CTCGAGCCAT CATG 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
CGAGGTTAAA AAACGTCTAG GCCCCCCGAA CCACGGGGAC GTGGTTTTCC TTTGAAAAAC 
ACGATTGC 
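
The declared lengths of the two oligonucleotides above (34 and 68 base pairs) can be checked directly, and because both are listed as double-stranded, the complementary strand can be derived from the strand shown. The Python sketch below is illustrative; the names `SEQ_ID_NO_6`, `SEQ_ID_NO_7` and `reverse_complement` are chosen here for the example and do not come from the listing.

```python
# Illustrative sketch: verify declared lengths and derive the second strand.
# The sequences are copied from SEQ ID NO: 6 and SEQ ID NO: 7 above;
# the variable and function names are hypothetical.

SEQ_ID_NO_6 = "CTGCAGGCGA GCCTGAATTC CTCGAGCCAT CATG".replace(" ", "")
SEQ_ID_NO_7 = (
    "CGAGGTTAAA AAACGTCTAG GCCCCCCGAA CCACGGGGAC GTGGTTTTCC TTTGAAAAAC "
    "ACGATTGC"
).replace(" ", "")

# Watson-Crick pairing for an unambiguous DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in (("SEQ ID NO: 6", SEQ_ID_NO_6), ("SEQ ID NO: 7", SEQ_ID_NO_7)):
        print(name, len(seq), reverse_complement(seq))
```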

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: C-terminal 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(H) CELL LINE: WI38 (genomic DNA)

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: human genomic library 

(B) CLONE: lambda 111-1 

(viii) POSITION IN GENOME: 

(C) UNITS: bp

Arg Ile Asn Ile Tyr Glu Val Met Lys Pro Pro Ala Glu Val Val Pro
-115 -110 -105

Gly His Leu Ile Thr Arg Leu Leu Asp Thr Arg Leu Val His His Asn
-100 -95 -90 -85

Val Thr Arg Trp Glu Thr Phe Asp Val Ser Pro Ala Val Leu Arg Trp
-80 -75 -70

Thr Arg Glu Lys Gln Pro Asn Tyr Gly Leu Ala Ile Glu Val Thr His
-65 -60 -55

Leu His Gln Thr Arg Thr His Gln Gly Gln His Val Arg Ile Ser Arg
-50 -45 -40

Ser Leu Pro Gln Gly Ser Gly Asn Trp Ala Gln Leu Arg Pro Leu Leu
-35 -30 -25

Val Thr Phe Gly His Asp Gly Arg Gly His Ala Leu Thr Arg Arg Arg
-20 -15 -10 -5

Arg Ala Lys Arg Ser Pro Lys His His Ser Gln Arg Ala Arg Lys Lys
1 5 10

Asn Lys Asn Cys Arg Arg His Ser Leu Tyr Val Asp Phe Ser Asp Val
15 20 25

Gly Trp Asn Asp Trp Ile Val Ala Pro Pro Gly Tyr Gln Ala Phe Tyr
30 35 40

Cys His Gly Asp Cys Pro Phe Pro Leu Ala Asp His Leu Asn Ser Thr
45 50 55 60

Asn His Ala Ile Val Gln Thr Leu Val Asn Ser Val Asn Ser Ser Ile
65 70 75

Pro Lys Ala Cys Cys Val Pro Thr Glu Leu Ser Ala Ile Ser Met Leu
80 85 90

Tyr Leu Asp Glu Tyr Asp Lys Val Val Leu Lys Asn Tyr Gln Glu Met
95 100 105

Val Val Glu Gly Cys Gly Cys Arg
110 115
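
The residue numbers in the protein listings count the propeptide backwards from -1 and the mature protein forwards from 1, with no residue numbered 0. The Python sketch below shows one way to express that convention; the function name `listing_residue_number` is hypothetical, and the propeptide lengths used (292 for SEQ ID NO: 4, 318 for SEQ ID NO: 2) are read off the first residue numbers shown in those listings.

```python
# Illustrative sketch of the listing's residue-numbering convention:
# propeptide residues are negative, mature residues start at 1, and 0 is skipped.
# The function name is hypothetical.


def listing_residue_number(index, propeptide_length):
    """Map a 0-based index in the precursor to the listing's residue number."""
    if index < 0:
        raise ValueError("index must be non-negative")
    offset = index - propeptide_length
    # Negative offsets are propeptide positions; shift the rest past zero.
    return offset if offset < 0 else offset + 1


if __name__ == "__main__":
    # SEQ ID NO: 4: 408 residues, first residue numbered -292.
    print(listing_residue_number(0, 292), listing_residue_number(407, 292))
    # SEQ ID NO: 2: 428 residues, first residue numbered -318.
    print(listing_residue_number(0, 318), listing_residue_number(427, 318))
```

With these inputs the last residue of the 408-residue sequence is numbered 116 and the last residue of the 428-residue sequence is numbered 110, consistent with the declared lengths.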

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
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TGT GGG TGC CGC TGAGATCAGG CAGTCCTTGA GGATAGACAG ATATACACAC 1666 

Cys Gly Cys Arg 
115 

CACACACACA CACCACATAC ACCACACACA CACGTTCCCA TCCACTCACC CACACACTAC 1726 

ACAGACTGCT TCCTTATAGC TGGACTTTTA TTTAAAAAAA AAAAAAAAAA AATGGAAAAA 1786 

ATCCCTAAAC ATTCACCTTG ACCTTATTTA TGACTTTACG TGCAAATGTT TTGACCATAT 1846 

TGATCATATA TTTTGACAAA ATATATTTAT AACTACGTAT TAAAAGAAAA AAATAAAATG 1906 

AGTCATTATT TTAAAAAAAA AAAAAAAACT CTAGAGTCGA CGGAATTC 1954 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 408 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ile Pro Gly Asn Arg Met Leu Met Val Val Leu Leu Cys Gln Val
-292 -290 -285 -280

Leu Leu Gly Gly Ala Ser His Ala Ser Leu Ile Pro Glu Thr Gly Lys
-275 -270 -265

Lys Lys Val Ala Glu Ile Gln Gly His Ala Gly Gly Arg Arg Ser Gly
-260 -255 -250 -245

Gln Ser His Glu Leu Leu Arg Asp Phe Glu Ala Thr Leu Leu Gln Met
-240 -235 -230

Phe Gly Leu Arg Arg Arg Pro Gln Pro Ser Lys Ser Ala Val Ile Pro
-225 -220 -215

Asp Tyr Met Arg Asp Leu Tyr Arg Leu Gln Ser Gly Glu Glu Glu Glu
-210 -205 -200

Glu Gln Ile His Ser Thr Gly Leu Glu Tyr Pro Glu Arg Pro Ala Ser
-195 -190 -185

Arg Ala Asn Thr Val Arg Ser Phe His His Glu Glu His Leu Glu Asn
-180 -175 -170 -165

Ile Pro Gly Thr Ser Glu Asn Ser Ala Phe Arg Phe Leu Phe Asn Leu
-160 -155 -150

Ser Ser Ile Pro Glu Asn Glu Val Ile Ser Ser Ala Glu Leu Arg Leu
-145 -140 -135

Phe Arg Glu Gln Val Asp Gln Gly Pro Asp Trp Glu Arg Gly Phe His
-130 -125 -120

GTG GAC CAG GGC CCT GAT TGG GAA AGG GGC TTC CAC CGT ATA AAC ATT 942
Val Asp Gln Gly Pro Asp Trp Glu Arg Gly Phe His Arg Ile Asn Ile
-125 -120 -115

TAT GAG GTT ATG AAG CCC CCA GCA GAA GTG GTG CCT GGG CAC CTC ATC 990
Tyr Glu Val Met Lys Pro Pro Ala Glu Val Val Pro Gly His Leu Ile
-110 -105 -100

ACA CGA CTA CTG GAC ACG AGA CTG GTC CAC CAC AAT GTG ACA CGG TGG 1038
Thr Arg Leu Leu Asp Thr Arg Leu Val His His Asn Val Thr Arg Trp
-95 -90 -85

GAA ACT TTT GAT GTG AGC CCT GCG GTC CTT CGC TGG ACC CGG GAG AAG 1086
Glu Thr Phe Asp Val Ser Pro Ala Val Leu Arg Trp Thr Arg Glu Lys
-80 -75 -70 -65

CAG CCA AAC TAT GGG CTA GCC ATT GAG GTG ACT CAC CTC CAT CAG ACT 1134
Gln Pro Asn Tyr Gly Leu Ala Ile Glu Val Thr His Leu His Gln Thr
-60 -55 -50

CGG ACC CAC CAG GGC CAG CAT GTC AGG ATT AGC CGA TCG TTA CCT CAA 1182
Arg Thr His Gln Gly Gln His Val Arg Ile Ser Arg Ser Leu Pro Gln
-45 -40 -35

GGG AGT GGG AAT TGG GCC CAG CTC CGG CCC CTC CTG GTC ACC TTT GGC 1230
Gly Ser Gly Asn Trp Ala Gln Leu Arg Pro Leu Leu Val Thr Phe Gly
-30 -25 -20

CAT GAT GGC CGG GGC CAT GCC TTG ACC CGA CGC CGG AGG GCC AAG CGT 1278
His Asp Gly Arg Gly His Ala Leu Thr Arg Arg Arg Arg Ala Lys Arg
-15 -10 -5

AGC CCT AAG CAT CAC TCA CAG CGG GCC AGG AAG AAG AAT AAG AAC TGC 1326
Ser Pro Lys His His Ser Gln Arg Ala Arg Lys Lys Asn Lys Asn Cys
1 5 10 15

CGG CGC CAC TCG CTC TAT GTG GAC TTC AGC GAT GTG GGC TGG AAT GAC 1374
Arg Arg His Ser Leu Tyr Val Asp Phe Ser Asp Val Gly Trp Asn Asp
20 25 30

TGG ATT GTG GCC CCA CCA GGC TAC CAG GCC TTC TAC TGC CAT GGG GAC 1422
Trp Ile Val Ala Pro Pro Gly Tyr Gln Ala Phe Tyr Cys His Gly Asp
35 40 45

TGC CCC TTT CCA CTG GCT GAC CAC CTC AAC TCA ACC AAC CAT GCC ATT 1470
Cys Pro Phe Pro Leu Ala Asp His Leu Asn Ser Thr Asn His Ala Ile
50 55 60

GTG CAG ACC CTG GTC AAT TCT GTC AAT TCC AGT ATC CCC AAA GCC TGT 1518
Val Gln Thr Leu Val Asn Ser Val Asn Ser Ser Ile Pro Lys Ala Cys
65 70 75 80

TGT GTG CCC ACT GAA CTG AGT GCC ATC TCC ATG CTG TAC CTG GAT GAG 1566
Cys Val Pro Thr Glu Leu Ser Ala Ile Ser Met Leu Tyr Leu Asp Glu
85 90 95

TAT GAT AAG GTG GTA CTG AAA AAT TAT CAG GAG ATG GTA GTA GAG GGA 1614
Tyr Asp Lys Val Val Leu Lys Asn Tyr Gln Glu Met Val Val Glu Gly
100 105 110

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTCTAGAGGG CAGAGGAGGA GGGAGGGAGG GAAGGAGCGC GGAGCCCGGC CCGGAAGCTA 60

GGTGAGTGTG GCATCCGAGC TGAGGGACGC GAGCCTGAGA CGCCGCTGCT GCTCCGGCTG 120

AGTATCTAGC TTGTCTCCCC GATGGGATTC CCGTCCAAGC TATCTCGAGC CTGCAGCGCC 180

ACAGTCCCCG GCCCTCGCCC AGGTTCACTG CAACCGTTCA GAGGTCCCCA GGAGCTGCTG 240

CTGGCGAGCC CGCTACTGCA GGGACCTATG GAGCCATTCC GTAGTGCCAT CCCGAGCAAC 300

GCACTGCTGC AGCTTCCCTG AGCCTTTCCA GCAAGTTTGT TCAAGATTGG CTGTCAAGAA 360

TCATGGACTG TTATTATATG CCTTGTTTTC TGTCAAGACA CC ATG ATT CCT GGT 414
Met Ile Pro Gly
-292 -290

AAC CGA ATG CTG ATG GTC GTT TTA TTA TGC CAA GTC CTG CTA GGA GGC 462
Asn Arg Met Leu Met Val Val Leu Leu Cys Gln Val Leu Leu Gly Gly
-285 -280 -275

GCG AGC CAT GCT AGT TTG ATA CCT GAG ACG GGG AAG AAA AAA GTC GCC 510
Ala Ser His Ala Ser Leu Ile Pro Glu Thr Gly Lys Lys Lys Val Ala
-270 -265 -260

GAG ATT CAG GGC CAC GCG GGA GGA CGC CGC TCA GGG CAG AGC CAT GAG 558
Glu Ile Gln Gly His Ala Gly Gly Arg Arg Ser Gly Gln Ser His Glu
-255 -250 -245

CTC CTG CGG GAC TTC GAG GCG ACA CTT CTG CAG ATG TTT GGG CTG CGC 606
Leu Leu Arg Asp Phe Glu Ala Thr Leu Leu Gln Met Phe Gly Leu Arg
-240 -235 -230 -225

CGC CGC CCG CAG CCT AGC AAG AGT GCC GTC ATT CCG GAC TAC ATG CGG 654
Arg Arg Pro Gln Pro Ser Lys Ser Ala Val Ile Pro Asp Tyr Met Arg
-220 -215 -210

GAT CTT TAC CGG CTT CAG TCT GGG GAG GAG GAG GAA GAG CAG ATC CAC 702
Asp Leu Tyr Arg Leu Gln Ser Gly Glu Glu Glu Glu Glu Gln Ile His
-205 -200 -195

AGC ACT GGT CTT GAG TAT CCT GAG CGC CCG GCC AGC CGG GCC AAC ACC 750
Ser Thr Gly Leu Glu Tyr Pro Glu Arg Pro Ala Ser Arg Ala Asn Thr
-190 -185 -180

GTG AGG AGC TTC CAC CAC GAA GAA CAT CTG GAG AAC ATC CCA GGG ACC 798
Val Arg Ser Phe His His Glu Glu His Leu Glu Asn Ile Pro Gly Thr
-175 -170 -165

AGT GAA AAC TCT GCT TTT CGT TTC CTC TTT AAC CTC AGC AGC ATC CCT 846
Ser Glu Asn Ser Ala Phe Arg Phe Leu Phe Asn Leu Ser Ser Ile Pro
-160 -155 -150 -145

GAG AAC GAG GTG ATC TCC TCT GCA GAG CTT CGG CTC TTC CGG GAG CAG 894
Glu Asn Glu Val Ile Ser Ser Ala Glu Leu Arg Leu Phe Arg Glu Gln
-140 -135 -130

Gly Ala Ser Ser His Cys Gln Lys Thr Ser Leu Arg Val Asn Phe Glu
5 10 15

Asp Ile Gly Trp Asp Ser Trp Ile Ile Ala Pro Lys Glu Tyr Asp Ala
20 25 30

Tyr Glu Cys Lys Gly Gly Cys Phe Phe Pro Leu Ala Asp Asp Val Thr
35 40 45 50

Pro Thr Lys His Ala Ile Val Gln Thr Leu Val His Leu Glu Phe Pro
55 60 65

Thr Lys Val Gly Lys Ala Cys Cys Val Pro Thr Lys Leu Ser Pro Ile
70 75 80

Ser Ile Leu Tyr Lys Asp Asp Met Gly Val Pro Thr Leu Lys Tyr His
85 90 95

Tyr Glu Gly Met Ser Val Ala Glu Cys Gly Cys Arg
100 105 110

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1954 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(G) CELL TYPE: Osteosarcoma Cell Line

(H) CELL LINE: U-2OS

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: U2OS cDNA in Lambda gt10

(B) CLONE: Lambda U2OS-3

(viii) POSITION IN GENOME:

(C) UNITS: bp

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 403..1629

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1279..1626

(ix) FEATURE: 

(A) NAME/KEY: mRNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Pro Gly Ala Phe Arg Val Ala Leu Leu Pro Leu Phe Leu Leu
-318 -315 -310 -305

Val Cys Val Thr Gln Gln Lys Pro Leu Gln Asn Trp Glu Gln Ala Ser
-300 -295 -290

Pro Gly Glu Asn Ala His Ser Ser Leu Gly Leu Ser Gly Ala Gly Glu
-285 -280 -275

Glu Gly Val Phe Asp Leu Gln Met Phe Leu Glu Asn Met Lys Val Asp
-270 -265 -260 -255

Phe Leu Arg Ser Leu Asn Leu Ser Gly Ile Pro Ser Gln Asp Lys Thr
-250 -245 -240

Arg Ala Glu Pro Pro Gln Tyr Met Ile Asp Leu Tyr Asn Arg Tyr Thr
-235 -230 -225

Thr Asp Lys Ser Ser Thr Pro Ala Ser Asn Ile Val Arg Ser Phe Ser
-220 -215 -210

Val Glu Asp Ala Ile Ser Thr Ala Ala Thr Glu Asp Phe Pro Phe Gln
-205 -200 -195

Lys His Ile Leu Ile Phe Asn Ile Ser Ile Pro Arg His Glu Gln Ile
-190 -185 -180 -175

Thr Arg Ala Glu Leu Arg Leu Tyr Val Ser Cys Gln Asn Asp Val Asp
-170 -165 -160

Ser Thr His Gly Leu Glu Gly Ser Met Val Val Tyr Asp Val Leu Glu
-155 -150 -145

Asp Ser Glu Thr Trp Asp Gln Ala Thr Gly Thr Lys Thr Phe Leu Val
-140 -135 -130

Ser Gln Asp Ile Arg Asp Glu Gly Trp Glu Thr Leu Glu Val Ser Ser
-125 -120 -115

Ala Val Lys Arg Trp Val Arg Ala Asp Ser Thr Thr Asn Lys Asn Lys
-110 -105 -100 -95

Leu Glu Val Thr Val Gln Ser His Arg Glu Ser Cys Asp Thr Leu Asp
-90 -85 -80

Ile Ser Val Pro Pro Gly Ser Lys Asn Leu Pro Phe Phe Val Val Phe
-75 -70 -65

Ser Asn Asp Arg Ser Asn Gly Thr Lys Glu Thr Arg Leu Glu Leu Lys
-60 -55 -50

Glu Met Ile Gly His Glu Gln Glu Thr Met Leu Val Lys Thr Ala Lys
-45 -40 -35

Asn Ala Tyr Gln Val Ala Gly Glu Ser Gln Glu Glu Glu Gly Leu Asp
-30 -25 -20 -15

Gly Tyr Thr Ala Val Gly Pro Leu Leu Ala Arg Arg Lys Arg Ser Thr

-15 -10 -5

AGG AGC ACC GGA GCC AGC AGC CAC TGC CAG AAG ACT TCT CTC AGG GTG 1608
Arg Ser Thr Gly Ala Ser Ser His Cys Gln Lys Thr Ser Leu Arg Val
1 5 10 15

AAC TTT GAG GAC ATC GGC TGG GAC AGC TGG ATC ATT GCA CCC AAG GAA 1656
Asn Phe Glu Asp Ile Gly Trp Asp Ser Trp Ile Ile Ala Pro Lys Glu
20 25 30

TAT GAC GCC TAT GAG TGT AAA GGG GGT TGC TTC TTC CCA TTG GCT GAT 1704
Tyr Asp Ala Tyr Glu Cys Lys Gly Gly Cys Phe Phe Pro Leu Ala Asp
35 40 45

GAC GTG ACA CCC ACC AAA CAT GCC ATC GTG CAG ACC CTG GTG CAT CTC 1752
Asp Val Thr Pro Thr Lys His Ala Ile Val Gln Thr Leu Val His Leu
50 55 60

GAG TTC CCC ACA AAG GTG GGC AAA GCC TGC TGC GTT CCC ACC AAA CTG 1800
Glu Phe Pro Thr Lys Val Gly Lys Ala Cys Cys Val Pro Thr Lys Leu
65 70 75

AGT CCC ATC TCC ATC CTC TAC AAG GAT GAC ATG GGG GTG CCA ACC CTC 1848
Ser Pro Ile Ser Ile Leu Tyr Lys Asp Asp Met Gly Val Pro Thr Leu
80 85 90 95

AAG TAC CAC TAT GAG GGG ATG AGT GTG GCT GAG TGT GGG TGT AGG TAGTCCCTGC 1903
Lys Tyr His Tyr Glu Gly Met Ser Val Ala Glu Cys Gly Cys Arg
100 105 110

AGCCACCCAG GGTGGGGATA CAGGACATGG AAGAGGTTCT GGTACGGTCC TGCATCCTCC 1963

TGCGCATGGT ATGCCTAAGT TGATCAGAAA CCATCCTTGA GAAGAAAAGG AGTTAGTTGC 2023

CCTTCTTGTG TCTGGTGGGT CCCTCTGCTG AAGTGACAAT GACTGGGGTA TGCGGGCCTG 2083

TGGGCAGAGC AGGAGACCCT GGAAGGGTTA GTGGGTAGAA AGATGTGAAA AAGGAAGCTG 2143

TGGGTAGATG ACCTGCACTC CAGTGATTAG AAGTCCAGCC TTACCTGTGA GAGAGCTCCT 2203

GGCATCTAAG AGAACTCTGC TTCCTCATCA TCCCCACCGA CTTGTTCTTC CTTGGGAGTG 2263

TGTCCTCAGG GAGAACAGCA TTGCTGTTCC TGTGCCTCAA GCTCCCAGCT GACTCTCCTG 2323

TGGCTCATAG GACTGAATGG GGTGAGGAAG AGCCTGATGC CCTCTGGCAA TCAGAGCCCG 2383

AAGGACTTCA AAACATCTGG ACAACTCTCA TTGACTGATG CTCCAACATA ATTTTTAAAA 2443

AGAG 2447


(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

-255 -250 -245

GAC AAA ACC AGA GCG GAG CCA CCC CAG TAC ATG ATC GAC TTG TAC AAC 888
Asp Lys Thr Arg Ala Glu Pro Pro Gln Tyr Met Ile Asp Leu Tyr Asn
-240 -235 -230

AGA TAC ACA ACG GAC AAA TCG TCT ACG CCT GCC TCC AAC ATC GTG CGG 936
Arg Tyr Thr Thr Asp Lys Ser Ser Thr Pro Ala Ser Asn Ile Val Arg
-225 -220 -215 -210

AGC TTC AGC GTG GAA GAT GCT ATA TCG ACA GCT GCC ACG GAG GAC TTC 984
Ser Phe Ser Val Glu Asp Ala Ile Ser Thr Ala Ala Thr Glu Asp Phe
-205 -200 -195

CCC TTT CAG AAG CAC ATC CTG ATC TTC AAC ATC TCC ATC CCG AGG CAC 1032
Pro Phe Gln Lys His Ile Leu Ile Phe Asn Ile Ser Ile Pro Arg His
-190 -185 -180

GAG CAG ATC ACC AGG GCT GAG CTC CGA CTC TAT GTC TCC TGC CAA AAT 1080
Glu Gln Ile Thr Arg Ala Glu Leu Arg Leu Tyr Val Ser Cys Gln Asn
-175 -170 -165

GAT GTG GAC TCC ACT CAT GGG CTG GAA GGA AGC ATG GTC GTT TAT GAT 1128
Asp Val Asp Ser Thr His Gly Leu Glu Gly Ser Met Val Val Tyr Asp
-160 -155 -150

GTT CTG GAG GAC AGT GAG ACT TGG GAC CAG GCC ACG GGG ACC AAG ACC 1176
Val Leu Glu Asp Ser Glu Thr Trp Asp Gln Ala Thr Gly Thr Lys Thr
-145 -140 -135 -130

TTC TTG GTA TCC CAG GAC ATT CGG GAC GAA GGA TGG GAG ACT TTA GAA 1224
Phe Leu Val Ser Gln Asp Ile Arg Asp Glu Gly Trp Glu Thr Leu Glu
-125 -120 -115

GTA TCG AGT GCC GTG AAG CGG TGG GTC AGG GCA GAC TCC ACA ACA AAC 1272
Val Ser Ser Ala Val Lys Arg Trp Val Arg Ala Asp Ser Thr Thr Asn
-110 -105 -100

AAA AAT AAG CTC GAG GTG ACA GTG CAG AGC CAC AGG GAG AGC TGT GAC 1320
Lys Asn Lys Leu Glu Val Thr Val Gln Ser His Arg Glu Ser Cys Asp
-95 -90 -85

ACA CTG GAC ATC AGT GTC CCT CCA GGT TCC AAA AAC CTG CCC TTC TTT 1368
Thr Leu Asp Ile Ser Val Pro Pro Gly Ser Lys Asn Leu Pro Phe Phe
-80 -75 -70

GTT GTC TTC TCC AAT GAC CGC AGC AAT GGG ACC AAG GAG ACC AGA CTG 1416
Val Val Phe Ser Asn Asp Arg Ser Asn Gly Thr Lys Glu Thr Arg Leu
-65 -60 -55 -50

GAG CTG AAG GAG ATG ATC GGC CAT GAG CAG GAG ACC ATG CTT GTG AAG 1464
Glu Leu Lys Glu Met Ile Gly His Glu Gln Glu Thr Met Leu Val Lys
-45 -40 -35

ACA GCC AAA AAT GCT TAC CAG GTG GCA GGT GAG AGC CAA GAG GAG GAG 1512
Thr Ala Lys Asn Ala Tyr Gln Val Ala Gly Glu Ser Gln Glu Glu Glu
-30 -25 -20

GGT CTA GAT GGA TAC ACA GCT GTG GGA CCA CTT TTA GCT AGA AGG AAG 1560
Gly Leu Asp Gly Tyr Thr Ala Val Gly Pro Leu Leu Ala Arg Arg Lys

What is claimed is:

1. A BMP-9 polypeptide comprising the amino acid sequence from
amino acid #8 - 110 as set forth in FIG. 3 (SEQ ID NO: 9).

2. A BMP-9 polypeptide comprising the amino acid sequence from
amino acid #1 - 110 as set forth in FIG. 3 (SEQ ID NO: 9).

3. A BMP-9 polypeptide of claim 1 wherein said polypeptide is
a dimer wherein each subunit comprises at least the amino acid
sequence from amino acid #8 - 110 of FIG. 3 (SEQ ID NO: 9).

4. A BMP-9 polypeptide of claim 2 wherein said polypeptide is
a dimer wherein each subunit comprises at least the amino acid
sequence from amino acid #1 - 110 of FIG. 3 (SEQ ID NO: 9).

5. A purified BMP-9 protein produced by the steps of 

(a) culturing a cell transformed with a cDNA comprising 
the nucleotide sequence from nucleotide #124 to #453 as shown 
in FIG. 3 (SEQ ID NO: 8); and

(b) recovering and purifying from said culture medium a
protein comprising the amino acid sequence from amino acid #1
to amino acid #110 as shown in FIG. 3 (SEQ ID NO: 9).

6. A purified BMP-9 protein produced by the steps of 

(a) culturing a cell transformed with a cDNA comprising
the nucleotide sequence from nucleotide #124 to #453 as shown
in FIG. 3 (SEQ ID NO: 8); and

(b) recovering from said culture medium a protein
comprising an amino acid sequence from amino acid #8 to amino
acid #110 as shown in Figure 3 (SEQ ID NO: 9).

7. A BMP-9 protein characterized by the ability to induce the
formation of cartilage and/or bone.

8. A DNA sequence encoding a BMP-9 protein. 

9. The DNA sequence of claim 8 wherein said DNA comprises 

(a) nucleotide 124 to 453 (SEQ ID NO: 8); and 

(b) sequences which hybridize thereto under stringent 
hybridization conditions and exhibit the ability to form 
cartilage and/or bone.

10. The DNA sequence of claim 8 wherein said DNA comprises 

(a) nucleotide 145 to 453 (SEQ ID NO: 8); and 

(b) sequences which hybridize thereto under stringent 
hybridization conditions and exhibit the ability to form 
cartilage and/or bone.

11. A host cell transformed with a DNA sequence encoding BMP-9.

12. A method for producing a purified BMP-9 protein said
method comprising the steps of

(a) culturing a cell transformed with a cDNA comprising 
the nucleotide sequence encoding a BMP-9 protein; and 

(b) recovering and purifying said BMP-9 protein from the 
culture medium. 

13. A pharmaceutical composition comprising an effective 
amount of a BMP-9 protein in admixture with a pharmaceutically 
acceptable vehicle. 

14. A composition of claim 13 further comprising a matrix for 
supporting said composition and providing a surface for bone 
and/or cartilage growth. 

15. The composition of claim 14 wherein said matrix comprises 
a material selected from the group consisting of 
hydroxyapatite, collagen, poly lactic acid and tricalcium 
phosphate. 

16. A method for inducing bone and/or cartilage formation in a 
patient in need of same comprising administering to said 
patient an effective amount of the composition of claim 13. 

17. A pharmaceutical composition for wound healing and tissue
repair said composition comprising an effective amount of a
BMP-9 protein in a pharmaceutically acceptable vehicle.



18. A method for treating wounds and/or tissue repair in a 
patient in need of same comprising administering to said 
patient an effective amount of the composition of claim 17.

Figure 1A

10 20 30 40 50 60 70

CATTAATAAA TATTAAGTAT TGGAATTAGT GAAATTGGAG TTCCTTGTGG AAGGAAGTGG GCAAGTGAGC

80 90 100 110 120 130 140

TTTTTAGTTT GTGTCGGAAG CCTGTAATTA CGGCTCCAGC TCATAGTGGA ATGGCTATAC TTAGATTTAT

150 160 170 180 190 200 210 

GGATAGTTGG GTAGTAGGTG TAAATGTATG TGGTAAAAGG CCTAGGAGAT TTGTTGATCC AATAAATATG 

220 230 240 250 260 270 280 

ATTAGGGAAA CAATTATTAG GGTTCATGTT CGTCCTTTTG GTGTGTGGAT TAGCATTATT TGTTTGATAA 

290 300 310 320 330 340 350 

TAAGTTTAAC TAGTCAGTGT TGGAAAGAAT GGAGACGGTT GTTGATTAGG CGTTTTGAGG ATGGGAATAG 

360 370 380 390 400 410 420 

GATTGAAGGA AATATAATGA TGGCTACAAC GATTGGGAAT CCTATTATTG TTGGGGTAAT GAATGAGGCA 

430 440 450 460 470 480 490

AATAGATTTT CGTTCATTTT AATTCTCAAG GGGTTTTTAC TTTTATGTTT GTTAGTGATA TTGGTGAGTA 

500 510 520 530 540 550 560 

GGCCAAGGGT TAATAGTGTA ATTGAATTAT AGTGAAATCA TATTACTAGA CCTGATGTTA GAAGGAGGGC 

570 580 590 600 609 618

TGAAAAGGCT CCTTCCCTCC CAGGACAAAA CCGGAGCAGG GCCACCCGG ATG TCC CCT GGG
M S P G

627 636 645 654 663 672

GCC TTC CGG GTG GCC CTG CTC CCG CTG TTC CTG CTG GTC TGT GTC ACA CAG CAG
AFRVALLPLFLLVCVTQQ

681 690 699 708 717 726 

AAG CCG CTG CAG AAC TGG GAA CAA GCA TCC CCT GGG GAA AAT GCC CAC AGC TCC 
KPLQNWEQASPGENAHSS 

735 744 753 762 771 780 



CTG GGA TTG TCT GGA GCT GGA GAG GAG GGT GTC TTT GAC CTG CAG ATG TTC CTG 
LGLSGAGEEGVFDLQMFL 

789 798 807 816 825 834 

GAG AAC ATG AAG GTG GAT TTC CTA CGC AGC CTT AAC CTC AGC GGC ATT CCC TCC 
ENMKVDFLRSLNLSGIPS

Figure 1B

843 852 861 870 879 888 

CAG GAC AAA ACC AGA GCG GAG CCA CCC CAG TAC ATG ATC GAC TTG TAC AAC AGA 
QDKTRAEPPQYMIDLYNR 

897 906 915 924 933 942 

TAC ACA ACG GAC AAA TCG TCT ACG CCT GCC TCC AAC ATC GTG CGG AGC TTC AGC 
YTTDKSSTPASNIVRSFS 

951 960 969 978 987 996 

GTG GAA GAT GCT ATA TCG ACA GCT GCC ACG GAG GAC TTC CCC TTT CAG AAG CAC 
VEDAISTAATEDFPFQKH 

1005 1014 1023 1032 1041 1050

ATC CTG ATC TTC AAC ATC TCC ATC CCG AGG CAC GAG CAG ATC ACC AGG GCT GAG
ILIFNISIPRHEQITRAE

1059 1068 1077 1086 1095 1104

CTC CGA CTC TAT GTC TCC TGC CAA AAT GAT GTG GAC TCC ACT CAT GGG CTG GAA
LRLYVSCQNDVDSTHGLE

1113 1122 1131 1140 1149 1158

GGA AGC ATG GTC GTT TAT GAT GTT CTG GAG GAC AGT GAG ACT TGG GAC CAG GCC
GSMVVYDVLEDSETWDQA

1167 1176 1185 1194 1203 1212

ACG GGG ACC AAG ACC TTC TTG GTA TCC CAG GAC ATT CGG GAC GAA GGA TGG GAG
TGTKTFLVSQDIRDEGWE

1221 1230 1239 1248 1257 1266 

ACT TTA GAA GTA TCG AGT GCC GTG AAG CGG TGG GTC AGG GCA GAC TCC ACA ACA 
TLEVSSAVKRWVRADSTT 

1275 1284 1293 1302 1311 1320 

AAC AAA AAT AAG CTC GAG GTG ACA GTG CAG AGC CAC AGG GAG AGC TGT GAC ACA 
NKNKLEVTVQSHRESCDT 

1329 1338 1347 1356 1365 1374 

CTG GAC ATC AGT GTC CCT CCA GGT TCC AAA AAC CTG CCC TTC TTT GTT GTC TTC 
LDISVPPGSKNLPFFVVF

Figure 1C

1383 1392 1401 1410 1419 1428

TCC AAT GAC CGC AGC AAT GGG ACC AAG GAG ACC AGA CTG GAG CTG AAG GAG ATG
SNDRSNGTKETRLELKEM

1437 1446 1455 1464 1473 1482

ATC GGC CAT GAG CAG GAG ACC ATG CTT GTG AAG ACA GCC AAA AAT GCT TAC CAG
IGHEQETMLVKTAKNAYQ

1491 1500 1509 1518 1527 1536

GTG GCA GGT GAG AGC CAA GAG GAG GAG GGT CTA GAT GGA TAC ACA GCT GTG GGA
VAGESQEEEGLDGYTAVG

1545 1554 1563 1572 1581 1590

CCA CTT TTA GCT AGA AGG AAG AGG AGC ACC GGA GCC AGC AGC CAC TGC CAG AAG
PLLARRKRSTGASSHCQK
(319) (326)

1599 1608 1617 1626 1635 1644

ACT TCT CTC AGG GTG AAC TTT GAG GAC ATC GGC TGG GAC AGC TGG ATC ATT GCA
TSLRVNFEDIGWDSWIIA

1653 1662 1671 1680 1689 1698

CCC AAG GAA TAT GAC GCC TAT GAG TGT AAA GGG GGT TGC TTC TTC CCA TTG GCT
PKEYDAYECKGGCFFPLA

1707 1716 1725 1734 1743 1752

GAT GAC GTG ACA CCC ACC AAA CAT GCC ATC GTG CAG ACC CTG GTG CAT CTC GAG
DDVTPTKHAIVQTLVHLE

1761 1770 1779 1788 1797 1806

TTC CCC ACA AAG GTG GGC AAA GCC TGC TGC GTT CCC ACC AAA CTG AGT CCC ATC
FPTKVGKACCVPTKLSPI

1815 1824 1833 1842 1851 1860

TCC ATC CTC TAC AAG GAT GAC ATG GGG GTG CCA ACC CTC AAG TAC CAC TAT GAG
SILYKDDMGVPTLKYHYE

1869 1878 1887 1903 1913 1923

GGG ATG AGT GTG GCT GAG TGT GGG TGT AGG TAGTCCCTGC AGCCACCCAG GGTGGGGATA
GMSVAECGCR

(428)

Figure 1D

1933 1943 1953 1963 1973 1983 1993

CAGGACATGG AAGAGGTTCT GGTACGGTCC TGCATCCTCC TGCGCATGGT ATGCCTAAGT TGATCAGAAA

2003 2013 2023 2033 2043 2053 2063

CCATCCTTGA GAAGAAAAGG AGTTAGTTGC CCTTCTTGTG TCTGGTGGGT CCCTCTGCTG AAGTGACAAT

2073 2083 2093 2103 2113 2123 2133

GACTGGGGTA TGCGGGCCTG TGGGCAGAGC AGGAGACCCT GGAAGGGTTA GTGGGTAGAA AGATGTCAAA

2143 2153 2163 2173 2183 2193 2203

AAGGAAGCTG TGGGTAGATG ACCTGCACTC CAGTGATTAG AAGTCCAGCC TTACCTGTGA GAGAGCTCCT

2213 2223 2233 2243 2253 2263 2273

GGCATCTAAG AGAACTCTGC TTCCTCATCA TCCCCACCGA CTTGTTCTTC CTTGGGAGTG TGTCCTCAGG

2283 2293 2303 2313 2323 2333 2343

GAGAACAGCA TTGCTGTTCC TGTGCCTCAA GCTCCCAGCT GACTCTCCTG TGGCTCATAG GACTGAATGG

2353 2363 2373 2383 2393 2403 2413

GGTGAGGAAG AGCCTGATGC CCTCTGGCAA TCAGAGCCCG AAGGACTTCA AAACATCTGG ACAACTCTCA

2423 2433 2443

TTGACTGATG CTCCAACATA ATTTTTAAAA AGAG

Figure 2

10 20 30 40 50 60 70

CTCTAGAGGG CAGAGGAGGA GGGAGGGAGG GAAGGAGCGC GGAGCCCGGC CCGGAAGCTA GGTGAGTGTG

80 90 100 110 120 130 140

GCATCCGAGC TGAGGGACGC GAGCCTGAGA CGCCGCTGCT GCTCCGGCTG AGTATCTAGC TTGTCTCCCC

150 160 170 180 190 200 210

GATGGGATTC CCGTCCAAGC TATCTCGAGC CTGCAGCGCC ACAGTCCCCG GCCCTCGCCC AGGTTCACTG

220 230 240 250 260 270 280

CAACCGTTCA GAGGTCCCCA GGAGCTGCTG CTGGCGAGCC CGCTACTGCA GGGACCTATG GAGCCATTCC

290 300 310 320 330 340 350

GTAGTGCCAT CCCGAGCAAC GCACTGCTGC AGCTTCCCTG AGCCTTTCCA GCAAGTTTGT TCAAGATTGG

360 370 380 390 400 (1)

CTGTCAAGAA TCATGGACTG TTATTATATG CCTTGTTTTC TGTCAAGACA CC ATG ATT CCT
MET Ile Pro

417 432 447 462

GGT AAC CGA ATG CTG ATG GTC GTT TTA TTA TGC CAA GTC CTG CTA GGA GGC GCG
Gly Asn Arg MET Leu MET Val Val Leu Leu Cys Gln Val Leu Leu Gly Gly Ala

477 492 507

AGC CAT GCT AGT TTG ATA CCT GAG ACG GGG AAG AAA AAA GTC GCC GAG ATT CAG
Ser His Ala Ser Leu Ile Pro Glu Thr Gly Lys Lys Lys Val Ala Glu Ile Gln

522 537 552 567

GGC CAC GCG GGA GGA CGC CGC TCA GGG CAG AGC CAT GAG CTC CTG CGG GAC TTC
Gly His Ala Gly Gly Arg Arg Ser Gly Gln Ser His Glu Leu Leu Arg Asp Phe

582 597 612 627

GAG GCG ACA CTT CTG CAG ATG TTT GGG CTG CGC CGC CGC CCG CAG CCT AGC AAG
Glu Ala Thr Leu Leu Gln MET Phe Gly Leu Arg Arg Arg Pro Gln Pro Ser Lys

642 657 672

AGT GCC GTC ATT CCG GAC TAC ATG CGG GAT CTT TAC CGG CTT CAG TCT GGG GAG
Ser Ala Val Ile Pro Asp Tyr MET Arg Asp Leu Tyr Arg Leu Gln Ser Gly Glu

687 702 717 732

GAG GAG GAA GAG CAG ATC CAC AGC ACT GGT CTT GAG TAT CCT GAG CGC CCG GCC
Glu Glu Glu Glu Gln Ile His Ser Thr Gly Leu Glu Tyr Pro Glu Arg Pro Ala

747 762 777

AGC CGG GCC AAC ACC GTG AGG AGC TTC CAC CAC GAA GAA CAT CTG GAG AAC ATC
Ser Arg Ala Asn Thr Val Arg Ser Phe His His Glu Glu His Leu Glu Asn Ile

792 807 822 837

CCA GGG ACC AGT GAA AAC TCT GCT TTT CGT TTC CTC TTT AAC CTC AGC AGC ATC
Pro Gly Thr Ser Glu Asn Ser Ala Phe Arg Phe Leu Phe Asn Leu Ser Ser Ile

Figure 2A

852 867 882 897

CCT GAG AAC GAG GTG ATC TCC TCT GCA GAG CTT CGG CTC TTC CGG GAG CAG GTG
Pro Glu Asn Glu Val Ile Ser Ser Ala Glu Leu Arg Leu Phe Arg Glu Gln Val

912 927 942

GAC CAG GGC CCT GAT TGG GAA AGG GGC TTC CAC CGT ATA AAC ATT TAT GAG GTT
Asp Gln Gly Pro Asp Trp Glu Arg Gly Phe His Arg Ile Asn Ile Tyr Glu Val

957 972 987 1002

ATG AAG CCC CCA GCA GAA GTG GTG CCT GGG CAC CTC ATC ACA CGA CTA CTC GAC
MET Lys Pro Pro Ala Glu Val Val Pro Gly His Leu Ile Thr Arg Leu Leu Asp

1017 1032 1047

ACG AGA CTG GTC CAC CAC AAT GTG ACA CGG TGG GAA ACT TTT GAT GTG AGC CCT
Thr Arg Leu Val His His Asn Val Thr Arg Trp Glu Thr Phe Asp Val Ser Pro

1062 1077 1092 1107

GCG GTC CTT CGC TGG ACC CGG GAG AAG CAG CCA AAC TAT GGG CTA GCC ATT GAG
Ala Val Leu Arg Trp Thr Arg Glu Lys Gln Pro Asn Tyr Gly Leu Ala Ile Glu

1122 1137 1152 1167

GTG ACT CAC CTC CAT CAG ACT CGG ACC CAC CAG GGC CAG CAT GTC AGG ATT AGC
Val Thr His Leu His Gln Thr Arg Thr His Gln Gly Gln His Val Arg Ile Ser

1182 1197 1212

CGA TCG TTA CCT CAA GGG AGT GGG AAT TGG GCC CAG CTC CGG CCC CTC CTG GTC
Arg Ser Leu Pro Gln Gly Ser Gly Asn Trp Ala Gln Leu Arg Pro Leu Leu Val

1227 1242 1257 1272

ACC TTT GGC CAT GAT GGC CGG GGC CAT GCC TTG ACC CGA CGC CGG AGG GCC AAG
Thr Phe Gly His Asp Gly Arg Gly His Ala Leu Thr Arg Arg Arg Arg Ala Lys

1287 1302 1317

CGT AGC CCT AAG CAT CAC TCA CAG CGG GCC AGG AAG AAG AAT AAG AAC TGC CGG
Arg Ser Pro Lys His His Ser Gln Arg Ala Arg Lys Lys Asn Lys Asn Cys Arg

1332(311) 1347 1362 1377

CGC CAC TCG CTC TAT GTG GAC TTC AGC GAT GTG GGC TGG AAT GAC TGG ATT GTG
Arg His Ser Leu Tyr Val Asp Phe Ser Asp Val Gly Trp Asn Asp Trp Ile Val

1392 1407 1422 1437

GCC CCA CCA GGC TAC CAG GCC TTC TAC TGC CAT GGG GAC TGC CCC TTT CCA CTG
Ala Pro Pro Gly Tyr Gln Ala Phe Tyr Cys His Gly Asp Cys Pro Phe Pro Leu

1452 1467 1482

GCT GAC CAC CTC AAC TCA ACC AAC CAT GCC ATT GTG CAG ACC CTG GTC AAT TCT
Ala Asp His Leu Asn Ser Thr Asn His Ala Ile Val Gln Thr Leu Val Asn Ser

1497 1512 1527 1542

GTC AAT TCC AGT ATC CCC AAA GCC TGT TGT GTG CCC ACT GAA CTG AGT GCC ATC
Val Asn Ser Ser Ile Pro Lys Ala Cys Cys Val Pro Thr Glu Leu Ser Ala Ile

Figure 2B

1557 1572

TCC ATG CTG TAC CTG GAT GAG TAT GAT AAG GTG GTA CTG AAA AAT TAT CAG GAG
Ser MET Leu Tyr Leu Asp Glu Tyr Asp Lys Val Val Leu Lys Asn Tyr Gln Glu

ATG GTA GTA GAG GGA TGT GGG TGC CGC TGAGATCAGG CAGTCCTTGA GGATAGACAG
MET Val Val Glu Gly Cys Gly Cys Arg

1666 1676 1686 1696

ATATACACAC CACACACACA CACCACATAC ACCACACACA CACGTTCCCA TCCACTCACC CACACACTAC

ACAGACTGCT TCCTTATAGC TGGACTTTTA TTTAAAAAAA AAAAAAAAAA AATGGAAAAA ATCCCTAAAC

ATTCACCTTG ACCTTATTTA TGACTTTACG TGCAAATGTT TTGACCATAT TGATCATATA TTTTGACAAA

1876 1886 1896

ATATATTTAT AACTACGTAT TAAAAGAAAA AAATAAAATG AGTCATTATT TTAAAAAAAA AAAAAAAACT

1946

CTAGAGTCGA CGGAATTC

Figure 3

TGA ACA AGA GAG TGC TCA AGA AGC TGT CCA AGG ACG GCT CCA CAG AGG 48
* Thr Arg Glu Cys Ser Arg Ser Cys Pro Arg Thr Ala Pro Gln Arg
-41 -40 -35 -30

CAG GTG AGA GCA GTC ACG AGG AGG ACA CGG ATG GCG CAC GTG GCT GCG 96
Gln Val Arg Ala Val Thr Arg Arg Thr Arg Met Ala His Val Ala Ala
-25 -20 -15 -10

GGG TCG ACT TTA GCC AGG CGG AAA AGG AGC GCC GGG GCT GGC AGC CAC 144
Gly Ser Thr Leu Ala Arg Arg Lys Arg Ser Ala Gly Ala Gly Ser His
-5 1 5

TGT CAA AAG ACC TCC CTG CGG GTA AAC TTC GAG GAC ATC GGC TGG GAC 192
Cys Gln Lys Thr Ser Leu Arg Val Asn Phe Glu Asp Ile Gly Trp Asp
10 15 20

AGC TGG ATC ATT GCA CCC AAG GAG TAT GAA GCC TAC GAG TGT AAG GGC 240
Ser Trp Ile Ile Ala Pro Lys Glu Tyr Glu Ala Tyr Glu Cys Lys Gly
25 30 35

GGC TGC TTC TTC CCC TTG GCT GAC GAT GTG ACG CCG ACG AAA CAC GCT 288
Gly Cys Phe Phe Pro Leu Ala Asp Asp Val Thr Pro Thr Lys His Ala
40 45 50 55

ATC GTG CAG ACC CTG GTG CAT CTC AAG TTC CCC ACA AAG GTG GGC AAG 336
Ile Val Gln Thr Leu Val His Leu Lys Phe Pro Thr Lys Val Gly Lys
60 65 70

GCC TGC TGT GTG CCC ACC AAA CTG AGC CCC ATC TCC GTC CTC TAC AAG 384
Ala Cys Cys Val Pro Thr Lys Leu Ser Pro Ile Ser Val Leu Tyr Lys
75 80 85

GAT GAC ATG GGG GTG CCC ACC CTC AAG TAC CAT TAC GAG GGC ATG AGC 432
Asp Asp Met Gly Val Pro Thr Leu Lys Tyr His Tyr Glu Gly Met Ser
90 95 100

GTG GCA GAG TGT GGG TGC AGG TAGTATCTGC CTGCGGG 470
Val Ala Glu Cys Gly Cys Arg
105 110
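
The translation shown in Figure 3 can be reproduced from the nucleotide sequence. The Python sketch below is illustrative: it assumes the standard genetic code and 1-based, inclusive coordinates, takes the mature region as nucleotides 124 to 453 (as recited in the claims), and uses names (`FIGURE_3_ROWS`, `translate`, `MATURE`) that are chosen here and are not part of the figure.

```python
# Illustrative sketch: translate the Figure 3 nucleotide sequence.
# Assumptions: standard genetic code; 1-based inclusive coordinates;
# mature region = nucleotides 124..453. All names are hypothetical.
from itertools import product

FIGURE_3_ROWS = [
    "TGA ACA AGA GAG TGC TCA AGA AGC TGT CCA AGG ACG GCT CCA CAG AGG",
    "CAG GTG AGA GCA GTC ACG AGG AGG ACA CGG ATG GCG CAC GTG GCT GCG",
    "GGG TCG ACT TTA GCC AGG CGG AAA AGG AGC GCC GGG GCT GGC AGC CAC",
    "TGT CAA AAG ACC TCC CTG CGG GTA AAC TTC GAG GAC ATC GGC TGG GAC",
    "AGC TGG ATC ATT GCA CCC AAG GAG TAT GAA GCC TAC GAG TGT AAG GGC",
    "GGC TGC TTC TTC CCC TTG GCT GAC GAT GTG ACG CCG ACG AAA CAC GCT",
    "ATC GTG CAG ACC CTG GTG CAT CTC AAG TTC CCC ACA AAG GTG GGC AAG",
    "GCC TGC TGT GTG CCC ACC AAA CTG AGC CCC ATC TCC GTC CTC TAC AAG",
    "GAT GAC ATG GGG GTG CCC ACC CTC AAG TAC CAT TAC GAG GGC ATG AGC",
    "GTG GCA GAG TGT GGG TGC AGG TAGTATCTGC CTGCGGG",
]
DNA = "".join(FIGURE_3_ROWS).replace(" ", "")

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 124..453 correspond to the Python slice [123:453].
MATURE = translate(DNA[123:453])

if __name__ == "__main__":
    print(len(DNA), len(MATURE))
    print(MATURE)
```

The one-letter output can be compared residue by residue with the three-letter translation printed beneath the sequence in the figure.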
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